Before and After Unicode: Working with Polytonic Greek¹

Donald Mastronarde, U. of California, Berkeley

After the age of the punch card, when input and output for computing turned to the human-readable form of numbers and letters, these processes were very much English-centered. Computer engineers and programmers had little interest in the needs of multilingual and multiscript texts. When character sets other than plain-vanilla US English did become available, each set was limited in size to 256 items (and the real limit is more like 220). For many purposes only 128 items could be handled more or less gracefully, while the items in the upper half of a larger set might be ignored or misinterpreted by some programs. When the TLG originally began digitizing polytonic Greek texts, the beta code transcription and an elaborate collection of beta escapes were needed to cover the multitude of characters and symbols in specialized texts.

The introduction of the Macintosh in 1984 inaugurated a new era for those who wanted to use fonts for specialized scripts: users could create and edit bitmap fonts and print them without difficulty on the ImageWriter (Apple's dot-matrix printer), and they could hack the Mac's system resources to adapt keyboard input for the customized font. The late George Walsh recognized the potential benefit to Hellenists and released SMK GreekKeys soon after the appearance of the first Mac. The same capability attracted users of other non-Roman scripts.

¹ This is a slight revision (Feb. 2008) of the presentation made at the panel "Fonts, Encodings, Word-Processing, and Publication: a tutorial for classicists on fonts and Unicode" at the Annual Meeting of the American Philological Association in Montreal, January 2006. In this PDF version the illustrations from the PowerPoint slides are interspersed with the text.

Montreal APA Unicode Presentation, revised Feb. 2008

Each font contains within it an encoding, that is, a scheme for identifying each character numerically. The user strikes a key on the
keyboard and thus generates a numeric code, and this code tells the operating system to display the corresponding item in the current font. In the many special fonts, including those developed by George Walsh for GreekKeys, the encoding was ad hoc, invented by the end user: such fonts pretended to be organized as a standard Macintosh Roman font, but in reality placed other characters in some or all of the 220 or so positions available in fonts at that time. This explains why, in such encodings, you can change the font to a normal Roman font like Times and see ordinary Roman characters mixed with odd symbols or Roman letters with diacritics. But if you turn one Greek font into another Greek font that has a different encoding, the result is close to gibberish. The incompatibilities between the different encodings are obvious, and even more trouble may arise when a file is transferred from one platform to another.

Fig. 1. GreekKeys encoding with and without a GreekKeys font. (Panels: GreekKeys encoding in the Bosporos font; same text in Times font; same text converted to SuperGreek.)

(Panels of the next figure: SuperGreek encoding in the SuperGreek font; same text in Times font; same text converted to Bosporos.)

Fig. 2.
SuperGreek encoding with and without a SuperGreek font.

Look at the list of encodings that the web-based TLG or Perseus offers to cope with, or that GreekKeys Converter offers to convert.

Fig. 3. TLG font choices.

Fig. 4. Perseus Greek display choices. (The options shown include Latin transliteration, SuperGreek, Greek transliteration, Beta code, GreekKeys, Unicode (UTF-8), SGreek for Windows, Unicode (UTF-8) with pre-combined accents, and SPIonic.)

Fig. 5. GreekKeysConverter encoding choices. (The options shown include GreekKeys (Mac), Unicode, LaserGreek (Mac), IFAOGrec (Mac), SPIonic (Mac), GreekPoly (Mac), WinGreek, GreekKeys (Windows), SGreek (Windows), and Beta Code.)

A secondary consideration for those who want to use such non-standard fonts is keyboard input. In terms of ideal human interface design, the input scheme should be a service of the operating system, so that it is available in any application in which the user may wish to enter the specialized characters. The Macintosh OS has always had such
capability. If you install the traditional GreekKeys Universal keyboard resource or the GreekKeys Unicode input, then in principle such inputs are available to all applications that comply with the protocols of the operating system. Windows was slower to provide such ease of use, but now a Polytonic Greek keyboard (using the old manual-typewriter scheme) is supplied in Windows 2000/XP, and the tool Microsoft Keyboard Layout Creator allows the creation of customized keyboards, although its current version leaves something to be desired in terms of versatility, and its products, even if designed with care, may be hampered by incompatibilities in Word for Windows. With the help of version 1.4 of Microsoft Keyboard Layout Creator, GreekKeys Unicode keyboards have been produced as part of GreekKeys 2008 that are very close in functionality to the Mac Unicode keyboards.

Keyboard layouts feature two styles of dealing with diacritics (slide 7). From the beginning GreekKeys adopted the deadkey protocol: this is the same mechanism by which, on some manual typewriters, an umlaut or acute or grave or tilde could be added to a standard Roman character by entering it with a deadkey, or non-advancing key, before the regular character. An alternative choice is to use a ligature system or zero-width diacritic characters, entered after the regular character. This is the technique used by SuperGreek. While this avoids the need for a separate keyboard resource, this method often produces less than ideal placement of the diacritics, and the encoding of course remains non-standard and therefore should be discouraged now that Unicode is robust enough and widely supported enough to be used for all polytonic Greek purposes.
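The two input styles can be contrasted in a short Python sketch. This is only an illustration under stated assumptions, not how GreekKeys or SuperGreek is implemented: the key assignments in DEAD_KEYS are hypothetical, and the post-positive style is modeled with Unicode combining marks rather than with SuperGreek's own zero-width glyphs. The sketch relies on Python's standard unicodedata.normalize to merge a base letter and a combining mark into one precomposed character.

```python
import unicodedata

# Hypothetical dead-key assignments (NOT the actual GreekKeys layout):
# each dead key stands for one combining diacritic.
DEAD_KEYS = {
    "'": "\u0301",  # combining acute accent
    "`": "\u0300",  # combining grave accent
    ")": "\u0313",  # combining comma above (smooth breathing)
    "(": "\u0314",  # combining reversed comma above (rough breathing)
}


def type_with_dead_keys(keystrokes: str) -> str:
    """Dead-key style: the diacritic key is struck BEFORE the letter.

    A dead key produces no output by itself; it is held as pending
    state and merged with the next ordinary character.
    """
    output = []
    pending = ""
    for key in keystrokes:
        if key in DEAD_KEYS:
            pending += DEAD_KEYS[key]
            continue
        # Compose base letter + pending marks into one character if possible.
        output.append(unicodedata.normalize("NFC", key + pending))
        pending = ""
    return "".join(output)


def type_with_postpositive_marks(keystrokes: str) -> str:
    """Post-positive style: the diacritic key is struck AFTER the letter.

    Each mark is appended as a separate zero-width combining character;
    nothing is composed, so display quality depends on the font.
    """
    return "".join(DEAD_KEYS.get(key, key) for key in keystrokes)


dead = type_with_dead_keys(")αν'ηρ")
post = type_with_postpositive_marks("α)νη'ρ")
print(dead, [f"U+{ord(c):04X}" for c in dead])
print(post, [f"U+{ord(c):04X}" for c in post])
```

Both calls spell the same word, but the dead-key version stores four precomposed characters while the post-positive version stores six code points (letters plus separate marks). Normalizing the second string to NFC yields the first, which is the sense in which Unicode lets the two styles converge on one standard encoding.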
Fig. 6. Options for entry of characters with diacritics. (Panels: input with traditional GreekKeys Universal or GreekKeys Unicode; input with the SuperGreek font.)

Now let me turn to Unicode (slide 8). Unicode is an international standard that aims to provide a unique digital identification, or code point, for the characters and symbols used in most of the world's writing systems and publishing traditions (slide 9).

Fig. 7. Unicode.org home page.

The code points are assigned as hexadecimal numbers of four or five digits. Hexadecimal notation uses the arabic numerals 0 through 9 and the Latin letters A through F. The four-digit code points belong to the Basic Multilingual Plane (or BMP, or Plane 0), which was originally to contain everything in the standard (slide 10). But as time has passed it has become apparent that many more code points are needed for a truly global treatment of character sets, and five-digit code points have been introduced. Unicode is also divided into blocks, sets of characters and symbols that are related or that were proposed as a group.

Unicode Code Points

Character Name                 Sample   Unicode (hexadecimal) code point
Latin capital letter a         A        U+0041
Latin small letter a           a        U+0061
Greek capital letter alpha     Α        U+0391
Greek small letter alpha       α        U+03B1
Latin capital letter d         D        U+0044
Latin small letter d           d        U+0064
Cyrillic capital letter de     Д        U+0414
Cyrillic small letter de       д        U+0434

Unicode Planes

Basic Multilingual Plane (BMP)           Plane 0   U+0000 to U+FFFF
Supplementary Multilingual Plane (SMP)   Plane 1   U+10000 to U+1FFFF
Supplementary Ideographic Plane (SIP)    Plane 2   U+20000 to U+2FFFF

Unicode Blocks of Interest to Classicists

Name of Block (code points start at, hexadecimal)

Basic Latin (0000): basic roman characters and simple punctuation (type English and Latin)
Latin-1 Supplement (0080): roman letters with acute, grave, circumflex, umlaut, and additional symbols (type French, German, etc.)
Latin Extended-A (0100): roman vowels with macron or breve, letters with diacritics used in Central European languages (type Latin with macron or breve)
Combining Diacritical Marks (0300): accents, macron, breve, etc.
Greek and Coptic (0370): Greek for monotonic Greek, plus a few math/science symbol versions of Greek and a few distinctive Coptic letters
Latin Extended Additional (1E00): roman letters with underdots, underlines, and the like (Demotic Egyptian transliteration)
Greek Extended (1F00): precomposed characters for polytonic Greek and some archaic letters
General Punctuation (2000): additional symbols such as obelus, curved quotes
Miscellaneous Technical Symbols (2300): metrical symbols at 23D0ff.
Private Use Area (E000): used in GreekKeys Unicode fonts and some other scholarly fonts for precomposed characters like epsilon with circumflex, alpha with breve and smooth and acute
Linear B Syllabary (10000)
Linear B Ideograms (10080)
Aegean Numbers (10100)
Ancient Greek Numbers (10140)
Old Italic (10300)
Cypriot Syllabary (10800)
Byzantine Musical Symbols (1D000)
Ancient Greek Musical Notation (1D200)

The early history of Unicode was marked by some poor choices made for reasons of politics, economy, ignorance, or backwards compatibility. Thus
Greek was initially served by a block called Greek and Coptic, starting at U+0370, and this block served only the needs of monotonic Greek (and was also considered an inadequate provision for Coptic by the Coptic community). For scholarly purposes an additional block called Greek Extended was later added at U+1F00, and this serves many, but not all, of the needs of those who deal in polytonic and ancient texts. Additional needed characters occur in other blocks in the 4.1 and 5.0 versions of the standard, and Coptic has now received its own block. Some of the special added symbols have been encoded in the Supplementary Multilingual Plane (SMP, or Plane 1), which can make their use problematic in some applications that have been written to expect and process only four-digit code points from the Basic Multilingual Plane.

What difference does the maturing of Unicode make to the situation for scholars using polytonic Greek? First, there is a real hope of better and simpler communication in documents, email, and in browsers. A user with a modern OS probably has a default system font containing both Greek and Greek Extended, and modern programs are ready to display Unicode Greek without additional installations or fussy configuration. Second, there is also hope for some sort of long-term survivability (or the best that we can do in this regard in the ever-shifting world of information technology and telecommunications): what we create today using Unicode should remain compatible and readable for a long time. And third, it gives us an alternative to systems that are breaking down. With Unicode, the incompatibilities shown on the slide are gone, and important features of the software are restored to usefulness.

Incompatibilities of Traditional GreekKeys Encoding: compulsory display of white space for the encoding that George Walsh had selected for omega with smooth and acute (in modern applications); conflict with autotext features of MS Word, such as initial
capitalization; conflict with the MS Word automated "smart quotes" feature; MS Word (from version 6 onwards) unable to search correctly for many Greek letters with diacritics; MS Word unable to interpret where Greek words begin and end.

In most of the remainder of this presentation I want to talk about the practicalities for those who are making the transition to Unicode Greek. If you have typed whole books of alternating English and Greek, as I have, or just typed articles and handouts for classes with that same alternation, then you perhaps arrived at a habitual practice similar to the following. With traditional GreekKeys, I assign a keyboard command in Word to the Roman font in which I want to type English, Latin, etc., and a different command in Word to the Greek font that I want to use. I set the input to GreekKeys Universal and can leave it there almost always, since most of the optional features of the Roman font, such as umlauts and accents on vowels, work normally with GreekKeys Universal. I am compelled to change to the US keyboard only in the event that I need an option character on the top row, such as the section symbol (option-6) or the en dash (option-hyphen). Here is an example showing the places where I need to issue a single keyboard command to toggle fonts at each transition between typing English and typing Greek.

Typing with Traditional GreekKeys

For the glance of the bull as emblematic of ferocious anger, see 188 below and Ar. Frogs 804 (of angry Aeschylus) [F] ἔβλεψε γοῦν ταυρηδὸν ἐγκύψας κάτω [F]; also related are the rolling eyes and askance gaze of the agitated, resisting bull in Hel. 1557-8, or of the maddened Heracles likened to a bull in Her. 868-9, and the playful use in Plato Phaedo 117b (Socrates) [F] ὥσπερ εἰώθει ταυρηδὸν ὑποβλέψας [F].

[F] indicates change of font. GreekKeys Universal keyboard is left active throughout the typing.

We saw earlier that traditional GreekKeys and SuperGreek masquerade as Roman fonts, so that the Greek can be shifted to Roman
characters. When you use Unicode Greek, however, it is no longer the case that changing the Greek to another font will produce Roman characters. [...] or it will have only some (as in the case of a font containing a few letters used in math or science, or a font containing only the original Greek block of Unicode and not Greek Extended). In some circumstances you will see boxes [...] contain the needed code points.

Fig. 8. Font change with Unicode. (Panels: a sample phrase in the KadmosU font; the same converted to Lucida Grande; converted to Helvetica, OS 10.4 version; converted to Helvetica, OS 10.3 version, where the Greek appears as boxes; converted to Times New Roman in OS 10.3.)

Sometimes, however, the operating system or application will display the correct letters, but in a different font, a default system font that is used as the last resort to display rare code points. In this case one often cannot change the font of the characters at all.

Fig. 9. Mixed fonts with Unicode.

The uniqueness of the digital identification of each character gives a [...] platforms, although there are still a few glitches because of faults in the OSes or in the applications. What becomes a little more cumbersome, on the other hand, is the process of input when you are frequently changing from [...] separate fonts, as this can save a lot of work later in the process of editing and publishing. In this case, in order to type with GreekKeys Unicode or any other similar input, you must make a double command at each transition in
your writing. Thus, if I were to type the same sentences shown earlier with Unicode Greek, I would have to change my input and change my font at each transition, using two separate commands.

Typing with GreekKeys Unicode

For the glance of the bull as emblematic of ferocious anger, see 188 below and Ar. Frogs 804 (of angry Aeschylus) [I][F] ἔβλεψε γοῦν ταυρηδὸν ἐγκύψας κάτω [I][F]; also related are the rolling eyes and askance gaze of the agitated, resisting bull in Hel. 1557-8, or of the maddened Heracles likened to a bull in Her. 868-9, and the playful use in Plato Phaedo 117b (Socrates) [I][F] ὥσπερ εἰώθει ταυρηδὸν ὑποβλέψας [I][F].

[I] indicates change of input; [F] indicates change of font.

This represents a distinct change of habit from traditional GreekKeys and requires some getting used to. If you are fairly sure your document will not need to have its Greek converted separately to a different font, then you may of course limit yourself to only one command at each change, the one for the input. Unicode Greek fonts will normally include Roman characters as well, so you simply select a font such as Apple's Times (10.4 version) or KadmosU or Cardo and alternate between US and Greek inputs as needed.

Where do fonts and inputs get installed? In Mac OS X the relevant folders are in two locations: the top-level Library folder and the Library folder in your home directory. What you install in the former is available to all users of the computer, while what you install in the latter is available to the particular [...] users, and you should never change, remove, or add items in this folder.

Fig. 10. User-accessible Library folders
in Mac OS X (top level above, user's level below).

Inputs are placed in the folder Keyboard Layouts. Old-fashioned Roman input resources like GreekKeys Universal are small files that have the suffix .rsrc. The newer Unicode inputs are XML files with the suffix .keylayout, but are usually enclosed with related files in a special type of folder that has the suffix .bundle. Fonts go in one of the two user-accessible Fonts folders, and in OS X they appear in various forms with various icons and suffixes. In recent versions of OS X, rather than installing fonts manually to the Fonts folders, you should use FontBook.app to install new fonts and to remove duplications by verifying which version you want to be activated. Use the Preview menu of FontBook to select Show Font Info, then examine the info to see which version of a duplicate font is more recent. If the most recent version has a black dot next to it, it is being ignored: you should select that font and then select the command Resolve Duplicates under the Edit menu.

Fig. 11. Added Keyboard Layouts.

Fig. 12. Font types and icons, Mac OS X.

When a new font is installed, it is available to an application the next time that application is launched. When a keyboard layout is installed, the user must either log out and log back in or restart, and then use the System Preference entitled International.
Fig. 13. International System Preference.

When International is open, click on the Input Menu tab. At the top of the list are useful palettes: I recommend turning on the Character Palette and the Keyboard Viewer.

Fig. 14. Input pane of International System Preference.

The rest of the scrolling list contains dozens of localized inputs.
Fig. 15. Three views of the list of inputs in the Input pane.

Below the scrolling list are important settings. The checkbox at the bottom makes the input menu icon appear in the menu bar.

Fig. 16. Settings in the Input pane and the Input Menu. (The Input Menu shown lists US, US Extended, GreekKeys Unicode-US, GreekKeys Universal-US, and Unicode Hex Input, followed by Show Character Palette, Show Keyboard Viewer, and Open International.)

The other settings are for keyboard shortcuts to switch inputs without using the mouse. The default settings up to Tiger are that command-option-spacebar will move one item down in the list of activated inputs and command-spacebar will toggle between the two types of input (Roman vs. Unicode); more conveniently, in Tiger command-spacebar will toggle between the two most recently used items. In Tiger and later, however, Apple has interfered with this long-standing convention by trying to give the same keyboard shortcuts to Spotlight, and a user may need to reassign the shortcuts, which is easy enough to do (click on the Keyboard Shortcuts button, as seen in Fig. 16, or use the Keyboard & Mouse System Preference).

In Windows XP, user-added fonts are installed by opening the Fonts Control Panel and using the "Install new font" command in the File menu of the Fonts window. In Vista, the same command is available by right-clicking in the Fonts window.
Fig. 17. Windows XP Control Panels: Fonts and Fonts window.

The keyboards available in Windows are activated by a rather complicated sequence of steps. With GreekKeys 2008, however, the installer itself activates the newly installed keyboard in Windows XP or Vista; it is best to restart after the installation so that the keyboard choices in the language menu or language bar are accurately updated. Manual activation will be needed for older versions of Windows, or if another user on the computer wants to have the new keyboard available as well.²

There are a number of ways to enter Unicode characters in OS X. First, there are specific inputs for many languages, including Apple's own Greek Polytonic input in 10.4 and higher, and of course GreekKeys Unicode. If you want to try out Apple's input, you can activate it and then open the Keyboard Viewer to see how it is set up.

Fig. 18. Keyboard Viewer showing Apple's Greek Polytonic.

As you can see, this input places upsilon on the y key and theta on the u key, but the basic letters are otherwise placed just as in GreekKeys. The orange keys in the Viewer indicate the deadkeys. Holding down the shift and/or option key while the Viewer is open will show capitals and option characters and additional deadkeys.

² The steps to take are explained at the GreekKeys support site.

When you know the proper code point but do not know whether there is an input that would make entry of this code point easy, the solution is to use the input called Unicode Hex Input.
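What a hexadecimal input of this kind does can be sketched in a few lines of Python. This is an illustrative sketch only, not Apple's or Microsoft's implementation, and the function names are hypothetical. It turns typed hexadecimal digits into a character, looks up the character's official name with the standard unicodedata module, and computes the plane of a code point (the code point's value divided by 0x10000).

```python
import unicodedata


def char_from_hex(digits: str) -> str:
    """Return the character whose code point is given in hexadecimal."""
    return chr(int(digits, 16))


def plane_of(char: str) -> int:
    """Return the Unicode plane (0 = BMP, 1 = SMP, 2 = SIP, ...)."""
    return ord(char) >> 16


def utf16_units(char: str) -> int:
    """Count the 16-bit units needed to store the character in UTF-16."""
    return len(char.encode("utf-16-be")) // 2


for digits in ("03e2", "1F87", "10140"):
    ch = char_from_hex(digits)
    # The fallback string is used if this Python build lacks the name.
    name = unicodedata.name(ch, "(name unknown to this Python build)")
    print(f"U+{ord(ch):04X}  plane {plane_of(ch)}  "
          f"{utf16_units(ch)} UTF-16 unit(s)  {name}")
```

A four-digit code point fits in a single 16-bit unit, while a five-digit (Plane 1) code point needs two. That difference is one reason software written with only the Basic Multilingual Plane in mind can mishandle SMP characters.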
Fig. 19. Selecting Unicode Hex Input.

As long as the code point is four digits, that is, within the Basic Multilingual Plane, you can activate Unicode Hex Input, then hold down the option key continuously while you type the four hexadecimal digits. For instance, typing 03e2 will get you Coptic Capital Letter Shei, if you have a scholarly font that contains this code point (which is not in any of the OS X system fonts; it is in some of the fonts installed with MS Office 2008). There is a somewhat similar utility on the Windows side, but it is internal to MS Word for Windows and does not apply more broadly: in this case you would type 03e2 into the document (and the digits would appear on the screen) and then, before entering anything else after the final 2, press the left-hand ALT key together with X: the 03e2 in your document will be transformed into the capital shei (again, if you have a font containing it). The ALT-X key device for Windows is also capable of handling five-digit code points in Plane 1 (Unicode Hex Input does not handle SMP).

Another way to find and insert unusual Unicode characters is to use the OS X Character Palette or the Windows Character Map. The Character Palette can be set to see characters in various sets. The most useful settings are perhaps those for Code Tables and for Glyph. With the former you can see all the blocks of Unicode arranged in order; with the latter you can see the inventory of all the glyphs in a given font.

Fig. 20. Glyph view in OS X Character Palette.

You can search by code point or by a part of the official Unicode name
if you know it: for instance, if you are looking for a small s with circumflex, you can search for "letter s" or for "circumflex" and locate it. Then you can determine which of your installed fonts include this character. The Character Palette can be used to insert the SMP code points, which Unicode Hex Input does not handle.

Windows Character Map is a small application located in Programs/Accessories/System Tools, displaying the characters in a specified font in Unicode code point order. First set the font, and then select a character and click the Copy button so that you can paste it into a document of your choice. If you know the code point, you can enable Advanced View and enter the code point to locate the character in the table. Character Map has limitations: SMP (five-digit) code points are not available for selection under either XP or Vista; under XP this utility shows only the BMP (four-digit) code points that were present in Unicode 4.0 and ignores Unicode 4.1 and 5.0 and pipeline characters that may actually be present in the font; under Vista, however, it shows all BMP code points present in the selected font, including pipeline characters.

Another useful utility in Windows that sometimes outperforms the Character Map is the Insert Symbol palette of MS Word, or the comparable Insert Special Character palette of OpenOffice Writer. With this palette you can specify the font you want to draw from, and the characters will be displayed in Unicode order. There are limitations and inconsistencies in how well these work, depending on whether you are running XP or Vista, on what version of Word is used, and perhaps on other factors I have not been able to identify. First, five-digit code points are not shown in Word's palette (and OpenOffice showed some inconsistency between setups), so the ALT-X method may be the only way to enter them. Second, this palette may fail to display even some of the four-digit code points added in recent versions of the Unicode
standard, or may display them only after you search for them by code point.

Fig. 21. Insert Symbol palette of MS Word for Windows.

I want to conclude by referring briefly to the problems and prospects for use of Unicode. First, for much of this decade there have been applications that are defective in their treatment of Unicode (Quark XPress before version 7, PowerPoint 2004 for OS X, various browsers, WordPerfect). Second, most applications have been unable to take advantage of smart features in modern fonts. (Microsoft Word 2004 for OS X was a big offender in this.) As of early 2008, the OSes, major applications, and major browsers (especially if you are running the latest versions, which for reasons of security everyone should try to do) have improved greatly.³ The major inconvenience now will be when users with the latest software share items with users who have not upgraded.

In the future, fonts are supposed to become smarter: this means that they will contain internal mechanisms for displaying the correct precomposed glyph even when a decomposed sequence of Unicode code points is used. In fact, the roadmap of the Unicode Consortium provides that eventually all data should be maintained in decomposed form, using only base character and combining diacritics. That means that eventually one should not use the Greek Extended block to express polytonic Greek, but only the original Greek block with combining diacritics. At present, however, when you want to use a non-standard combination, such as smooth and circumflex over omicron, although it is certainly possible to express the desired character as a combination of official Unicode code points, the result you see on your screen or printed on paper is likely to be unacceptable (if in fact the application allows you to see anything at all). That is why, for the moment,
GreekKeys Unicode fonts, in agreement with some other fonts produced for scholars, use Private Use Area code points for a number of special characters, despite the fact that using the PUA interferes with the universality of access to the characters. But smart font features are now also supported by GreekKeys 2008, and cross-platform use of at least partially decomposed input is gradually becoming more practical. Unfortunately, there are two systems for creating smart features, one from Apple (AAT, Apple Advanced Typography) and another from Adobe and Microsoft (OpenType), and until recently a font maker who wanted to serve a cross-platform audience needed to include both types of features. Apple, however, is now gradually extending OpenType support in OS X, so OpenType features alone will soon work broadly enough.4

Precomposed form (generally used now): U+1F87
Decomposed form (eventually to be preferred): U+03B1 (alpha), U+0314 (combining rough breathing), U+0342 (combining circumflex accent), U+0345 (combining iota subscript)

Fig. 22. Precomposed vs. Decomposed (the two forms as displayed in TextEdit, which uses smart features, in the fonts BosporosU, Lucida Grande, NAU, and Times)

3 With Firefox, good support of Unicode Greek and OpenType will arrive with the release of version 3. The version 3 beta release is already vastly superior to Firefox 2.x, which sometimes does very odd things to polytonic Greek.

4 The compatibility of various applications with OpenType features is tracked on the GreekKeys support site.

Some URLs for further information

GREEKKEYS SUPPORT SITE
main page: https://webfiles.berkeley.edu/~pinax/greekkeys
Troubleshooting GreekKeys 2005 and 2008:
https://webfiles.berkeley.edu/~pinax/greekkeys/GKUFAQ2mac.html
https://webfiles.berkeley.edu/~pinax/greekkeys/GKUFAQ2win.html

GREEKKEYS SALES SITE
httpWWW esellerate ct 39 keVs
NOTE: APA members can get
a discount on individual licenses by accessing a private sales page through the Members Only link at the APA web site.

UNICODE
main page: http://www.unicode.org
code charts (PDF for each block): http://www.unicode.org/charts
searching for chart when you have a codepoint: http://www.unicode.org/charts
searching for chart or codepoint by character name (not a complete list): http://www.unicode.org/charts/charindex.html
new characters in the pipeline for approval and inclusion in the standard: http://www.unicode.org/alloc/Pipeline.html

TLG's Beta to Unicode Quick Guide: http://www.tlg.uci.edu/quickbeta.pdf

Script Encoding Initiative: http://www.linguistics.berkeley.edu/sei
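
A supplementary illustration of the precomposed and decomposed forms discussed above (Fig. 22): the following is a minimal sketch, not part of the original presentation, assuming Python 3 and its standard unicodedata module. It converts the precomposed code point U+1F87 to its decomposed sequence and back, using the standard Unicode normalization forms NFD and NFC. The helper names code_points and is_bmp are hypothetical, introduced here only for illustration; is_bmp distinguishes four-digit (BMP) code points from the five-digit (SMP) code points that some of the utilities described earlier cannot handle.

```python
import unicodedata

# Precomposed form from the Greek Extended block:
# alpha with rough breathing, circumflex, and iota subscript.
PRECOMPOSED = "\u1F87"

# Decomposed form: base letter from the original Greek block
# followed by three combining diacritics.
DECOMPOSED = "\u03B1\u0314\u0342\u0345"


def code_points(text):
    """Return the code points of text in U+XXXX notation (hypothetical helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]


def is_bmp(ch):
    """True if ch has a four-digit (BMP) code point (hypothetical helper)."""
    return ord(ch) <= 0xFFFF


# NFD fully decomposes; NFC recomposes wherever a precomposed form exists.
nfd = unicodedata.normalize("NFD", PRECOMPOSED)
nfc = unicodedata.normalize("NFC", DECOMPOSED)

print("decomposed: ", code_points(nfd))
print("recomposed: ", code_points(nfc))

# List the official Unicode name of each element of the decomposed sequence.
for ch in nfd:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Whether the decomposed sequence is displayed acceptably still depends on the font and the application, as described above; normalization changes only the stored code points, not the rendering.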


