Why do Python on Mac OS X and Python on CentOS Linux interpret \U escapes in strings differently?

Here are two Python interpreter sessions. The first is from Python on CentOS. The second is from the built-in Python on Mac OS X 10.7. Why does the second session create strings of length two from the \U escape sequence, and subsequently raise an error?

$ python 
Python 2.6.6 (r266:84292, Dec 7 2011, 20:48:22) 
[GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] on linux2 
Type "help", "copyright", "credits" or "license" for more information. 
>>> u'\U00000020' 
u' ' 
>>> u'\U00000065' 
u'e' 
>>> u'\U0000FFFF' 
u'\uffff' 
>>> u'\U00010000' 
u'\U00010000' 
>>> len(u'\U00010000') 
1 
>>> ord(u'\U00010000') 
65536

$ python 
Python 2.6.7 (r267:88850, Jul 31 2011, 19:30:54) 
[GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] on darwin 
>>> u'\U00000020' 
u' ' 
>>> u'\U00000065' 
u'e' 
>>> u'\U0000FFFF' 
u'\uffff' 
>>> u'\U00010000' 
u'\U00010000' 
>>> len(u'\U00010000') 
2 
>>> ord(u'\U00010000') 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
TypeError: ord() expected a character, but string of length 2 found

2012-06-06 audiodude

I'm not at all sure about this, but it's possible that the Python on your Mac OS X system is a "narrow build", which represents Unicode internally with only 16 bits per code unit and stores code points at or above 2 ** 16 as a pair of characters (a surrogate pair). That would explain len(u'\U00010000') == 2.
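To make the "pair of characters" concrete, here is a rough sketch of the standard UTF-16 surrogate arithmetic. The function names to_surrogate_pair and from_surrogate_pair are made up for illustration; this is not what the interpreter literally runs, only the same mapping written out in Python.

```python
def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into UTF-16 (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not in a supplementary plane")
    offset = code_point - 0x10000        # 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low


def from_surrogate_pair(high, low):
    """Recombine a (high, low) surrogate pair into the original code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogate_pair(0x10000)
print("%#x %#x" % (high, low))  # prints: 0xd800 0xdc00
```

So on a narrow build u'\U00010000' would be stored as the two code units 0xD800 and 0xDC00, which is why len() reports 2 and ord() refuses it.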

Try unichr(0x10000) on OS X and see whether you get an error referring to narrow builds. See also "What encoding do normal python strings use?", in particular the answer by IVH.
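Another check that doesn't rely on triggering the error is sys.maxunicode, which is 65535 (0xFFFF) on a narrow build and 1114111 (0x10FFFF) on a wide one. A minimal sketch; the helper describe_unicode_build and its optional parameter are hypothetical, added only so the logic can be exercised without switching interpreters:

```python
import sys


def describe_unicode_build(maxunicode=None):
    """Return 'narrow' or 'wide' based on the largest supported code point.

    maxunicode defaults to sys.maxunicode; passing a value explicitly is
    only for illustration and testing.
    """
    if maxunicode is None:
        maxunicode = sys.maxunicode
    return "narrow" if maxunicode == 0xFFFF else "wide"


print(describe_unicode_build())
```

Note that Python 3.3 and later always report 0x10FFFF, so this distinction only matters on older interpreters such as the 2.6 ones in the question.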

You can recompile Python to use a wide build even if the default Python on your system is a narrow build.

2012-06-07 02:03:19

Good catch. This is probably it. See also this article: http://wordaligned.org/articles/narrow-python – dda

This is the right answer. I get the "narrow Python build" error, and sys.maxunicode returns 65535 on Mac OS X. – audiodude

@user802500: I may be misunderstanding, but isn't it Mac OS that has the narrow build in this case? – fholo
