diff --git a/doc/CC_eng_tha.txt b/doc/CC_eng_tha.txt deleted file mode 100644 index 3d33db6bd..000000000 --- a/doc/CC_eng_tha.txt +++ /dev/null @@ -1,1698 +0,0 @@ - -mkText (mkPhr (mkQS (mkCl she_NP sleep_V))) questMarkPunct (mkText (mkPhr yes_Utt) fullStopPunct) -does she sleep ? yes . -หล่อนนอนหลับใช่ไหม ใช่ - -mkText yes_Utt -yes . -ใช่ - -mkText (mkS pastTense (mkCl she_NP sleep_V)) -she slept . -หล่อนนอนหลับ - -mkText (mkCl she_NP sleep_V) -she sleeps . -หล่อนนอนหลับ - -mkText (mkQS pastTense (mkQCl (mkCl she_NP sleep_V))) -did she sleep ? -หล่อนนอนหลับไหม - -mkText negativePol (mkImp sleep_V) -don't sleep ! -อย่านอนหลับ - -mkText (mkText (mkPhr (mkUtt where_IAdv)) questMarkPunct (mkText (mkPhr (mkUtt here_Adv)))) (mkText (mkPhr (mkUtt when_IAdv)) questMarkPunct (mkText (mkPhr (mkUtt now_Adv)) exclMarkPunct)) -where ? here . when ? now ! -ที่ไหนที่นี่เมื่อไรเดี๋ยวนี้ -- spaces - -mkPhr but_PConj (mkUtt (mkImp sleep_V)) (mkVoc (mkNP i_Pron friend_N)) -but sleep , my friend -แต่นอนหลับซิเพี่อนของฉัน - -mkPhr (mkS futureTense negativePol (mkCl she_NP sleep_V)) -she won't sleep -หล่อนไม่นอนหลับ -- place of negation (but possible) - -mkPhr (mkCl she_NP sleep_V) -she sleeps -หล่อนนอนหลับ - -mkPhr (mkQS conditionalTense (mkQCl (mkCl she_NP sleep_V))) -would she sleep -หล่อนนอนหลับไหม - -mkPhr (mkImp sleep_V) -sleep -นอนหลับซิ - -mkPhr (mkPConj and_Conj) (mkUtt now_Adv) -and now -และเดี๋ยวนี้ -- tone mark - -mkPhr yes_Utt (mkVoc (mkNP i_Pron friend_N)) -yes , my friend -ใช่เพี่อนของฉัน - -mkUtt (mkS pastTense (mkCl she_NP sleep_V)) -she slept -หล่อนนอนหลับ - -mkUtt (mkCl she_NP sleep_V) -she sleeps -หล่อนนอนหลับ - -mkUtt (mkQS pastTense negativePol (mkQCl who_IP sleep_V)) -who didn't sleep -ใครไม่นอนหลับ -- neg place not preferred - -mkUtt (mkQCl who_IP sleep_V) -who sleeps -ใครนอนหลับ - -mkUtt pluralImpForm negativePol (mkImp (mkVP man_N)) -don't be men -อย่าเป็นชาย - -mkUtt who_IP -who -ใคร -- ay - -mkUtt why_IAdv -why -ทำไม -- th - -mkUtt (mkNP this_Det man_N) 
-this man -ชายคนนี้ - -mkUtt here_Adv -here -ที่นี่ - -mkUtt (mkVP sleep_V) -to sleep -นอนหลับ - -mkUtt (mkCN beer_N) -beer -เบียร์ - -mkUtt (mkAP good_A) -good -ดี - -mkUtt (mkCard (mkNumeral n5_Unit)) -five -ห้า - -mkPhr (lets_Utt (mkVP sleep_V)) -let's sleep -นอนหลับ - -mkS positivePol (mkCl she_NP sleep_V) -she sleeps -หล่อนนอนหลับ - -mkS negativePol (mkCl she_NP sleep_V) -she doesn't sleep -หล่อนไม่นอนหลับ -- neg place - -mkS simultaneousAnt (mkCl she_NP sleep_V) -she sleeps -หล่อนนอนหลับ - -mkS anteriorAnt (mkCl she_NP sleep_V) -she has slept -หล่อนนอนหลับ - -mkS presentTense (mkCl she_NP sleep_V) -she sleeps -หล่อนนอนหลับ - -mkS pastTense (mkCl she_NP sleep_V) -she slept -หล่อนนอนหลับ - -mkS futureTense (mkCl she_NP sleep_V) -she will sleep -หล่อนจะนอน -- ca !! - -mkS conditionalTense (mkCl she_NP sleep_V) -she would sleep -หล่อนนอนหลับ - -mkUtt singularImpForm (mkImp (mkVP man_N)) -be a man -เป็นชายซิ - -mkUtt pluralImpForm (mkImp (mkVP man_N)) -be men -เป็นชายซิ - -mkUtt politeImpForm (mkImp (mkVP man_N)) -be a man -เป็นชายซิ - -mkS conditionalTense anteriorAnt negativePol (mkCl she_NP sleep_V) -she wouldn't have slept -หล่อนไม่ได้นอนหลับ -- daay !! 
- -mkS and_Conj (mkS (mkCl she_NP sleep_V)) (mkS (mkCl i_NP run_V)) -she sleeps and I run -หล่อนนอนหลับและฉันวิ่ง - -mkS and_Conj (mkListS (mkS (mkCl she_NP sleep_V)) (mkListS (mkS (mkCl i_NP run_V)) (mkS (mkCl (mkNP youSg_Pron) walk_V)))) -she sleeps , I run and you walk -หล่อนนอนหลับ ฉันวิ่งและคุณเดิน -- comma - -mkS today_Adv (mkS (mkCl she_NP sleep_V)) -today she sleeps -วันนี้หล่อนนอนหลับ - -mkCl she_NP sleep_V -she sleeps -หล่อนนอนหลับ - -mkCl she_NP love_V2 he_NP -she loves him -หล่อนรักเขา - -mkCl she_NP send_V3 it_NP he_NP -she sends it to him -หล่อนส่งมันให้กับเขา - -mkCl she_NP want_VV (mkVP sleep_V) -she wants to sleep -หล่อนอยากนอนหลับ - -mkCl she_NP say_VS (mkS (mkCl i_NP sleep_V)) -she says that I sleep -หล่อนพูดว่าฉันนอนหลับ -- wa is complaining - -mkCl she_NP wonder_VQ (mkQS (mkQCl who_IP sleep_V)) -she wonders who sleeps -หล่อนประหลาดใจว่าใครนอนหลับ - -mkCl she_NP become_VA old_A -she becomes old -หล่อนกลายเป็นคนแก่ -- classifier !! - -mkCl she_NP become_VA (mkAP very_AdA old_A) -she becomes very old -หล่อนกลายเป็นคนแก่มาก -- classifier !! 
- -mkCl she_NP paint_V2A it_NP red_A -she paints it red -หล่อนทามันสีแดง - -mkCl she_NP paint_V2A it_NP (mkAP red_A) -she paints it red -หล่อนทามันสีแดง - -mkCl she_NP answer_V2S he_NP (mkS (mkCl we_NP sleep_V)) -she answers to him that we sleep -หล่อนตอบเขาว่าเรานอนหลับ - -mkCl she_NP ask_V2Q he_NP (mkQS (mkQCl who_IP sleep_V)) -she asks him who sleeps -หล่อนถามเขาว่าใครนอนหลับ - -mkCl she_NP beg_V2V he_NP (mkVP sleep_V) -she begs him to sleep -หล่อนขอเขานอนหลับ - -mkCl she_NP old_A -she is old -หล่อนแก่ - -mkCl she_NP old_A he_NP -she is older than he -หล่อนแก่กว่าเขา - -mkCl she_NP married_A2 he_NP -she is married to him -หล่อนแต่งงานแล้วกับเขา - -mkCl she_NP (mkAP very_AdA old_A) -she is very old -หล่อนแก่มาก - -mkCl she_NP (mkNP the_Det woman_N) -she is the woman -หล่อนเป็นหญิง - -mkCl she_NP woman_N -she is a woman -หล่อนเป็นหญิง - -mkCl she_NP (mkCN old_A woman_N) -she is an old woman -หล่อนเป็นหญิงแก่ - -mkCl she_NP here_Adv -she is here -หล่อนอยู่ที่นี่ - -mkCl she_NP (mkVP always_AdV (mkVP sleep_V)) -she always sleeps -หล่อนนอนหลับเสมอ -- not forever - -mkCl house_N -there is a house -มีบ้าน - -mkCl (mkNP many_Det house_N) -there are many houses -มีบ้านหลายหลัง - -mkCl she_NP (mkRS (mkRCl which_RP (mkVP sleep_V))) -it is she who sleeps -หล่อนเป็นคนที่นอนหลับ -- not classifier !! - -mkCl here_Adv (mkS (mkCl she_NP sleep_V)) -it is here that she sleeps -หล่อนนอนหลับที่นี่ -- perhaps thini khue, otherwise thini in the end !! 
- -mkCl rain_V0 -it rains -มีฝน - -mkCl (progressiveVP (mkVP rain_V0)) -it is raining -กำลังมีฝน - -mkCl (mkSC (mkS (mkCl she_NP sleep_V))) (mkVP good_A) -that she sleeps is good -ว่าหล่อนนอนหลับดี - -mkS (genericCl (mkVP sleep_V)) -one sleeps -นอนหลับ - -mkUtt (mkVP sleep_V) -to sleep -นอนหลับ - -mkUtt (mkVP love_V2 he_NP) -to love him -รักเขา - -mkUtt (mkVP send_V3 it_NP he_NP) -to send it to him -ส่งมันให้กับเขา -- also without kab - -mkUtt (mkVP want_VV (mkVP sleep_V)) -to want to sleep -อยากนอนหลับ - -mkUtt (mkVP know_VS (mkS (mkCl she_NP sleep_V))) -to know that she sleeps -รู้ว่าหล่อนนอนหลับ - -mkUtt (mkVP wonder_VQ (mkQS (mkQCl who_IP sleep_V))) -to wonder who sleeps -ประหลาดใจว่าใครนอนหลับ -- ay - -mkUtt (mkVP become_VA (mkAP red_A)) -to become red -กลายเป็นสีแดง - -mkUtt (mkVP paint_V2A it_NP (mkAP red_A)) -to paint it red -ทามันเป็นสีแดง -- red was missing - -mkUtt (mkVP answer_V2S he_NP (mkS (mkCl she_NP sleep_V))) -to answer to him that she sleeps -ตอบเขาว่าหล่อนนอนหลับ -- missing compl - -mkUtt (mkVP ask_V2Q he_NP (mkQS (mkQCl who_IP sleep_V))) -to ask him who sleeps -ถามเขาว่าใครนอนหลับ -- missing compl - -mkUtt (mkVP beg_V2V he_NP (mkVP sleep_V)) -to beg him to sleep -ขอเขาให้นอนหลับ -- compl - -mkUtt (mkVP old_A) -to be old -แก่ - -mkUtt (mkVP old_A he_NP) -to be older than he -แก่กว่าเขา - -mkUtt (mkVP married_A2 he_NP) -to be married to him -แต่งงานแล้วกับเขา - -mkUtt (mkVP (mkAP very_AdA old_A)) -to be very old -แก่มาก - -mkUtt (mkVP woman_N) -to be a woman -เป็นหญิง -- pu ying !! - -mkUtt (mkVP (mkCN old_A woman_N)) -to be an old woman -เป็นหญิงแก่ -- pu !! 
- -mkUtt (mkVP (mkNP the_Det woman_N)) -to be the woman -เป็นหญิง - -mkUtt (mkVP here_Adv) -to be here -อยู่ที่นี่ - -mkUtt (mkVP (mkVP sleep_V) here_Adv) -to sleep here -นอนหลับที่นี่ - -mkUtt (mkVP always_AdV (mkVP sleep_V)) -always to sleep -นอนหลับเสมอ -- not forever - -mkUtt (mkVP (mkVPSlash paint_V2A (mkAP black_A)) it_NP) -to paint it black -ทามันเป็นสีดำ -- compl - -mkUtt (mkVP (mkVPSlash paint_V2A (mkAP black_A))) -to paint itself black -ทาตัวเองเป็นสีดำ -- compl - -mkUtt (mkVP (mkComp (mkAP warm_A))) -to be warm -อุ่น - -mkUtt (reflexiveVP love_V2) -to love itself -รักตัวเอง - -mkUtt (reflexiveVP (mkVPSlash paint_V2A (mkAP black_A))) -to paint itself black -ทาตัวเองเป็นสีดำ -- compl - -mkUtt (passiveVP love_V2) -to be loved -ถูกรัก - -mkUtt (passiveVP love_V2 she_NP) -to be loved by her -ถูกหล่อนรัก -- agent before verb !! - -mkUtt (progressiveVP (mkVP sleep_V)) -to be sleeping -กำลังนอนหลับ - -mkComp (mkAP old_A) -old -แก่ - -mkComp (mkNP this_Det man_N) -this man -ชายคนนี้ - -mkComp here_Adv -here -อยู่ที่นี่ - -mkSC (mkS (mkCl she_NP sleep_V)) -that she sleeps -ว่าหล่อนนอนหลับ - -mkSC (mkQS (mkQCl who_IP sleep_V)) -who sleeps -ว่าใครนอนหลับ -- ay - -mkSC (mkVP sleep_V) -to sleep -นอนหลับ - -mkImp (mkVP (mkVP come_V) (mkAdv to_Prep (mkNP i_Pron house_N))) -come to my house -มาถืงบ้านของฉันซิ - -mkImp come_V -come -มาซิ - -mkImp buy_V2 it_NP -buy it -ซื้อมันซิ - -mkUtt (mkNP this_Quant man_N) -this man -ชายคนนี้ - -mkUtt (mkNP this_Quant (mkCN old_A man_N)) -this old man -ชายแก่คนนี้ - -mkUtt (mkNP this_Quant (mkNum (mkNumeral n5_Unit)) (mkCN old_A man_N)) -these five old men -ชายแก่ห้าคนนี้ - -mkUtt (mkNP this_Quant (mkNum (mkNumeral n5_Unit)) man_N) -these five men -ชายห้าคนนี้ - -mkUtt (mkNP (mkDet the_Quant (mkNum (mkNumeral n5_Unit))) (mkCN old_A man_N)) -the five old men -ชายแก่ห้าคน - -mkUtt (mkNP (mkDet the_Quant (mkNum (mkNumeral n5_Unit))) man_N) -the five men -ชายห้าคน - -mkUtt (mkNP (mkNumeral (tenfoldSub100 n5_Unit)) (mkCN old_A man_N)) 
-fifty old men -ชายแก่ห้าสิบคน - -mkUtt (mkNP (mkNumeral (tenfoldSub100 n5_Unit)) man_N) -fifty men -ชายห้าสิบคน - -mkUtt (mkNP (mkDigits n5_Dig (mkDigits n1_Dig)) (mkCN old_A man_N)) -5 1 old men -ชายแก่ ๕๑ คน -- space around number !! - -mkUtt (mkNP (mkDigits n5_Dig (mkDigits n1_Dig)) man_N) -5 1 men -ชาย ๕๑ คน -- space - -mkUtt (mkNP i_Pron (mkCN old_A man_N)) -my old man -ชายแก่ของฉัน - -mkUtt (mkNP i_Pron man_N) -my man -ชายของฉัน - -mkUtt (mkNP paris_PN) -Paris -ปารีส - -mkUtt (mkNP we_Pron) -we -เรา - -mkUtt (mkNP this_Quant) -this -นี้ - -mkUtt (mkNP this_Quant (mkNum (mkNumeral n5_Unit))) -these five -ห้านี้ - -mkUtt (mkNP (mkDet the_Quant (mkNum (mkNumeral n5_Unit)) (mkOrd good_A))) -the five best -ห้าที่ดีที่สุด -- thi as classifier !! - -mkUtt (mkNP (mkCN old_A beer_N)) -old beer -เบียร์เก่า -- old for inanimate - -mkUtt (mkNP beer_N) -beer -เบียร์ - -mkUtt (mkNP only_Predet (mkNP this_Det woman_N)) -only this woman -หญิงคนนี้เท่านั้น -- only comes last !! - -mkUtt (mkNP (mkNP the_Det man_N) see_V2) -the man seen -ชายเห็น - -mkUtt (mkNP (mkNP paris_PN) today_Adv) -Paris today -ปารีสวันนี้ - -mkUtt (mkNP (mkNP john_PN) (mkRS (mkRCl which_RP (mkVP walk_V)))) -John , who walks -จอห์นคนที่เดิน -- classifier !! - -mkUtt (mkNP or_Conj (mkNP this_Det woman_N) (mkNP john_PN)) -this woman or John -หญิงคนนี้หริอจอห์น - -mkUtt (mkNP or_Conj (mkListNP (mkNP this_Det woman_N) (mkListNP (mkNP john_PN) i_NP))) -this woman , John or I -หญิงคนนี้ จอห์นหริอฉัน -- space, no comma - -mkUtt i_NP -I -ฉัน - -mkUtt you_NP -you -คุณ - -mkUtt youPol_NP -you -คุณ - -mkUtt he_NP -he -เขา - -mkUtt she_NP -she -หล่อน - -mkUtt it_NP -it -มัน - -mkUtt we_NP -we -เรา - -mkUtt youPl_NP -you -คุณ - -mkUtt they_NP -they -เขาทั้งหลาย -- more than one - - -mkUtt this_NP -this -นี้ - -mkUtt that_NP -that -นั้น - -mkUtt these_NP -these -เหล่านี้ -- more than one !! - -mkUtt those_NP -those -เหล่านั้น -- more than one !! 
- -mkUtt (mkNP the_Det house_N) -the house -บ้าน - -mkUtt (mkNP a_Det house_N) -a house -บ้าน - -mkUtt (mkNP theSg_Det house_N) -the house -บ้าน - -mkUtt (mkNP thePl_Det house_N) -the houses -บ้าน - -mkUtt (mkNP aSg_Det woman_N) -a woman -หญิง - -mkUtt (mkNP aPl_Det woman_N) -women -หญิง - -mkUtt (mkNP this_Det woman_N) -this woman -หญิงคนนี้ - -mkUtt (mkNP that_Det woman_N) -that woman -หญิงคนนั้น - -mkUtt (mkNP these_Det woman_N) -these women -หญิงเหล่านี้ -- more than one !! - -mkUtt (mkNP those_Det woman_N) -those women -หญิงเหล่านั้น -- more !! - -mkUtt (mkNP (mkQuant i_Pron) house_N) -my house -บ้านของฉัน - -mkUtt (mkNP the_Quant house_N) -the house -บ้าน - -mkUtt (mkNP a_Quant house_N) -a house -บ้าน - -mkNum (mkNumeral (tenfoldSub100 n2_Unit)) -twenty -ยี่สิบ - -mkNum (mkDigits n2_Dig (mkDigits n1_Dig)) -2 1 -๒๑ - -mkNum (mkCard almost_AdN (mkCard (mkNumeral n5_Unit))) -almost five -เกิอบห้า -- before numeral !! - -mkNum (mkCard almost_AdN (mkCard (mkNumeral n5_Unit))) -almost five -เกิอบห้า -- before - -mkCard (mkNumeral n7_Unit) -seven -เจ็ด - -mkOrd small_A -smallest -เล็กที่สุด - -mkCard (mkAdN more_CAdv) (mkCard (mkNumeral n8_Unit)) -more than eight -มากกว่าแปด -- before numeral !! 
- -mkNumeral (mkSub1000 n9_Unit (mkSub100 n9_Unit n9_Unit)) -nine hundred and ninety - nine -เก้าร้อยเก้าสิบเก้า -- tone mark for nine - -mkNumeral (mkSub1000 n9_Unit (mkSub100 n9_Unit n9_Unit)) (mkSub1000 n9_Unit (mkSub100 n9_Unit n9_Unit)) -nine hundred and ninety - nine thousand nine hundred and ninety - nine -เก้าแสนเก้าหมื่นเก้าพันเก้าร้อยเก้าสิบเก้า -- tone - -thousandfoldNumeral (mkSub1000 n9_Unit (mkSub100 n9_Unit n9_Unit)) -nine hundred and ninety - nine thousand -เก้าแสนเก้าหมื่นเก้าพัน -- tone - -mkNumeral (mkSub1000 (mkSub100 n9_Unit n9_Unit)) -ninety - nine -เก้าสิบเก้า -- tone - -mkNumeral (mkSub1000 n9_Unit) -nine hundred -เก้าร้อย -- tone - -mkNumeral (mkSub1000 n9_Unit (mkSub100 n9_Unit n9_Unit)) -nine hundred and ninety - nine -เก้าร้อยเก้าสิบเก้า -- tone - -mkSub100 n8_Unit -eight -แปด - -mkSub100 n8_Unit n3_Unit -eighty - three -แปดสิบสาม - -mkSub100 n8_Unit -eight -แปด - -mkNumeral n1_Unit -one -หนึง - -mkNumeral n2_Unit -two -สอง - -mkNumeral n3_Unit -three -สาม - -mkNumeral n4_Unit -four -สี่ - -mkNumeral n5_Unit -five -ห้า - -mkNumeral n6_Unit -six -หก - -mkNumeral n7_Unit -seven -เจ็ด - -mkNumeral n8_Unit -eight -แปด - -mkNumeral n9_Unit -nine -เก้า -- tone - -mkDigits n4_Dig -4 -๔ - -mkDigits n1_Dig (mkDigits n2_Dig (mkDigits n3_Dig (mkDigits n3_Dig (mkDigits n4_Dig (mkDigits n8_Dig (mkDigits n6_Dig)))))) -1 , 2 3 3 , 4 8 6 -๑ ๒๓๓ ๔๘๖ -- commas if amount of money etc !! - -mkCN house_N -house -บ้าน - -mkCN mother_N2 (mkNP the_Det king_N) -mother of the king -แม่ของพระราชา -- khong - -mkCN distance_N3 (mkNP this_Det city_N) (mkNP paris_PN) -distance from this city to Paris -ระยะทางจากเมืองเมืองนี้ถืงปารีส - -mkCN mother_N2 -mother -แม่ - -mkCN distance_N3 -distance -ระยะทาง - -mkCN big_A house_N -big house -บ้านใหญ่ - -mkCN big_A (mkCN blue_A house_N) -big blue house -บ้านสีน้ำเงินหลังใหญ่ -- better with classifier !! 
- -mkCN (mkAP very_AdA big_A) house_N -very big house -บ้านใหญ่มาก - -mkCN (mkAP very_AdA big_A) (mkCN blue_A house_N) -very big blue house -บ้านสีน้ำเงินหลังใหญ่มาก -- better with classifier - -mkCN man_N (mkRS (mkRCl which_RP she_NP love_V2)) -man whom she loves -ชายที่หล่อนรัก - -mkCN (mkCN old_A man_N) (mkRS (mkRCl which_RP she_NP love_V2)) -old man whom she loves -ชายแก่ที่หล่อนรัก - -mkCN house_N (mkAdv on_Prep (mkNP the_Det hill_N)) -house on the hill -บ้านบนเนินเขา - -mkCN (mkCN big_A house_N) (mkAdv on_Prep (mkNP the_Det hill_N)) -big house on the hill -บ้านใหญ่บนเนินเขา - -mkNum (mkCard almost_AdN (mkCard (mkNumeral n5_Unit))) -almost five -เกิอบห้า -- order - -mkNum (mkCard almost_AdN (mkCard (mkNumeral n5_Unit))) -almost five -เกิอบห้า -- order - -mkCN (mkCN reason_N) (mkVP sleep_V) -reason to sleep -เหตุที่นอนหลับ -- add thi !! - -mkCN (mkCN reason_N) (mkVP sleep_V) -reason to sleep -เหตุที่นอนหลับ -- same as prev - -mkCN king_N (mkNP john_PN) -king John -พระราชาจอห์น - -mkCN (mkCN old_A king_N) (mkNP john_PN) -old king John -พระราชาแก่จอห์น - -mkAP warm_A -warm -อุ่น - -mkAP warm_A (mkNP paris_PN) -warmer than Paris -อุ่นกว่าปารีส - -mkAP married_A2 she_NP -married to her -แต่งงานแล้วกับหล่อน - -mkAP married_A2 -married -แต่งงานแล้ว - -mkCl (mkVP (mkAP (mkAP good_A) (mkS (mkCl she_NP sleep_V)))) -it is good that she sleeps -ดีว่าหล่อนนอนหลับ - -mkCl (mkVP (mkAP (mkAP uncertain_A) (mkQS (mkQCl who_IP sleep_V)))) -it is uncertain who sleeps -ลังเลใจว่าใครนอนหลับ - -mkCl she_NP (mkAP (mkAP ready_A) (mkVP sleep_V)) -she is ready to sleep -หล่อนพร้อมนอนหลับ - -mkCl she_NP (mkAP (mkAP ready_A) (mkSC (mkVP sleep_V))) -she is ready to sleep -หล่อนพร้อมนอนหลับ - -mkAP very_AdA old_A -very old -แก่มาก - -mkAP very_AdA (mkAP very_AdA old_A) -very very old -แก่มากมาก - -mkAP or_Conj (mkAP old_A) (mkAP young_A) -old or young -แก่หริอรุ่น - -mkAP and_Conj (mkListAP (mkAP old_A) (mkListAP (mkAP big_A) (mkAP warm_A))) -old , big and warm -แก่ ใหญ่และอุ่น -- comma - 
-mkAP (mkOrd old_A) -oldest -แก่ที่สุด - -mkAP as_CAdv (mkAP old_A) she_NP -as old as she -แก่เท่าหล่อน - -mkUtt (reflAP married_A2) -married to itself -แต่งงานแล้วกับตัวเอง - -mkUtt (comparAP warm_A) -warmer -อุ่นกว่า - -mkAdv warm_A -warmly -อุ่น - -mkAdv in_Prep (mkNP the_Det house_N) -in the house -ในบ้าน - -mkAdv when_Subj (mkS (mkCl she_NP sleep_V)) -when she sleeps -เมื่อหล่อนนอนหลับ -- when !!?? - -mkAdv more_CAdv warm_A he_NP -more warmly than he -อบอุ่นมากกว่าเขา -- mak kwa - -mkAdv more_CAdv warm_A (mkS (mkCl he_NP run_V)) -more warmly than he runs -อบอุ่นมากกว่าที่เขาวิ่ง - -mkAdv very_AdA (mkAdv warm_A) -very warmly -อุ่นมาก - -mkAdv and_Conj here_Adv now_Adv -here and now -ที่นี่และเดี๋ยวนี้ -- tone mark - -mkAdv and_Conj (mkListAdv (mkAdv with_Prep she_NP) (mkListAdv here_Adv now_Adv)) -with her , here and now -กับหล่อน ที่นี่และเดี๋ยวนี้ -- comma, tone - -mkQS conditionalTense anteriorAnt negativePol (mkQCl who_IP sleep_V) -who wouldn't have slept -ใครนอนไม่หลับ -- ay - -mkQS (mkCl she_NP sleep_V) -does she sleep -หล่อนนอนหลับไหม - -mkQCl (mkCl she_NP sleep_V) -does she sleep -หล่อนนอนหลับไหม - -mkQCl who_IP (mkVP (mkVP sleep_V) here_Adv) -who sleeps here -ใครนอนหลับที่นี่ -- ay - -mkQCl who_IP sleep_V -who sleeps -ใครนอนหลับ -- ay - -mkQCl who_IP love_V2 she_NP -who loves her -ใครรักหล่อน -- ay - -mkQCl who_IP send_V3 it_NP she_NP -who sends it to her -ใครส่งมันให้กับหล่อน -- ay, send - -mkQCl who_IP want_VV (mkVP sleep_V) -who wants to sleep -ใครอยากนอนหลับ -- ay - -mkQCl who_IP say_VS (mkS (mkCl i_NP sleep_V)) -who says that I sleep -ใครบอกว่าฉันนอนหลับ -- ay, say (also phut) - -mkQCl who_IP wonder_VQ (mkQS (mkQCl who_IP sleep_V)) -who wonders who sleeps -ใครประหลาดใจว่าใครนอนหลับ -- ay ay - -mkQCl who_IP become_VA old_A -who becomes old -ใครกลายเป็นคนแก่ -- ay, classifier !! 
- -mkQCl who_IP become_VA (mkAP very_AdA old_A) -who becomes very old -ใครกลายเป็นคนแก่มาก -- ay, classifier - -mkQCl who_IP paint_V2A it_NP red_A -who paints it red -ใครทามันสีแดง -- ay, compl - -mkQCl who_IP paint_V2A it_NP (mkAP very_AdA red_A) -who paints it very red -ใครทามันสีแดงมาก -- ay, compl - -mkQCl who_IP answer_V2S he_NP (mkS (mkCl we_NP sleep_V)) -who answers to him that we sleep -ใครตอบเขาว่าเรานอนหลับ -- ay, compl - -mkQCl who_IP ask_V2Q he_NP (mkQS (mkQCl who_IP sleep_V)) -who asks him who sleeps -- ay, compl -ใครถามเขาว่าใครนอนหลับ - -mkQCl who_IP beg_V2V he_NP (mkVP sleep_V) -who begs him to sleep -ใครขอให้เขานอนหลับ -- beg; possible withour hay - -mkQCl who_IP old_A -who is old -ใครแก่ -- ay - -mkQCl who_IP old_A he_NP -who is older than he -ใครแก่กว่าเขา - -mkQCl who_IP married_A2 he_NP -who is married to him -ใครแต่งงานแล้วกับเขา - -mkQCl who_IP (mkAP very_AdA old_A) -who is very old -ใครแก่มาก - -mkQCl who_IP (mkNP the_Det woman_N) -who is the woman -ใครเป็นหญิง -- possible with pu - -mkQCl who_IP woman_N -who is a woman -ใครเป็นหญิง - -mkQCl who_IP (mkCN old_A woman_N) -who is an old woman -ใครเป็นหญิงแก่ - -mkQCl who_IP here_Adv -who is here -ใครอยู่ที่นี่ - -mkQCl who_IP (mkVP always_AdV (mkVP sleep_V)) -who always sleeps -ใครนอนหลับเสมอ -- always - -mkQCl why_IAdv (mkCl she_NP sleep_V) -why does she sleep -หล่อนนอนหลับทำไม -- th - -mkQCl with_Prep who_IP (mkCl she_NP sleep_V) -with whom does she sleep -หล่อนนอนหลับกับใคร -- ay - -mkQCl where_IAdv she_NP -where is she -หล่อนอยู่ที่ไหน - -mkQCl (mkIComp who_IP) (mkNP this_Det man_N) -who is this man -ชายคนนี้เป็นใคร - -mkQCl (mkIP which_IQuant city_N) -which city is there -คือเมืองเมืองไหน -- order !! - -mkQCl who_IP she_NP -who is her -หล่อนเป็นใคร -- order - -mkQCl who_IP (mkClSlash (mkClSlash she_NP love_V2) today_Adv) -whom does she love today -หล่อนรักใครวันนี้ -- adv last or first !! 
- -mkIP (mkIDet which_IQuant (mkNum (mkNumeral n5_Unit))) (mkCN big_A city_N) -which five big cities -เมืองใหญ่ห้าเมืองไหน - -mkIP (mkIDet which_IQuant (mkNum (mkNumeral n5_Unit))) city_N -which five cities -เมืองห้าเมืองไหน - -mkIP (mkIDet which_IQuant (mkNum (mkNumeral n5_Unit))) -which five -ห้าไหน - -mkIP which_IQuant (mkCN big_A city_N) -which big city -เมืองใหญ่เมืองไหน - -mkIP which_IQuant (mkNum (mkNumeral n5_Unit)) (mkCN big_A city_N) -which five big cities -เมืองใหญ่ห้าเมืองไหน - -mkIP which_IQuant city_N -which city -เมืองเมืองไหน - -mkIP who_IP (mkAdv in_Prep (mkNP paris_PN)) -who in Paris -ใครในปารีส - -mkUtt what_IP -what -อะไร - -mkUtt who_IP -who -ใคร - -mkIAdv in_Prep (mkIP which_IQuant city_N) -in which city -ในเมืองเมืองไหน - -mkIAdv where_IAdv (mkAdv in_Prep (mkNP paris_PN)) -where in Paris -ที่ไหนในปารีส - -mkIP (mkIDet which_IQuant pluralNum) house_N -which houses -บ้านหลังไหน - -mkIP (mkIDet which_IQuant) house_N -which house -บ้านหลังไหน - -mkIP which_IDet house_N -which house -บ้านหลังไหน - -mkIP whichPl_IDet house_N -which houses -บ้านหลังไหน - -mkCN woman_N (mkRS conditionalTense anteriorAnt negativePol (mkRCl which_RP sleep_V)) -woman who wouldn't have slept -หญิงที่นอนไม่หลับ - -mkCN woman_N (mkRS (mkRCl which_RP sleep_V)) -woman who sleeps -หญิงที่นอนหลับ - -mkCN woman_N (mkRS or_Conj (mkRS (mkRCl which_RP sleep_V)) (mkRS (mkRCl which_RP we_NP love_V2))) -woman who sleeps or whom we love -หญิงที่นอนหลับหริอที่เรารัก - -mkCN woman_N (mkRS (mkRCl which_RP (mkVP (mkVP sleep_V) here_Adv))) -woman who sleeps here -หญิงที่นอนหลับที่นี่ - -mkCN woman_N (mkRS (mkRCl which_RP sleep_V)) -woman who sleeps -หญิงที่นอนหลับ - -mkCN woman_N (mkRS (mkRCl which_RP love_V2 he_NP)) -woman who loves him -หญิงที่รักเขา - -mkCN woman_N (mkRS (mkRCl which_RP send_V3 it_NP he_NP)) -woman who sends it to him -หญิงที่ส่งมันให้กับเขา -- send - -mkCN woman_N (mkRS (mkRCl which_RP want_VV (mkVP sleep_V))) -woman who wants to sleep -หญิงที่อยากนอนหลับ - -mkCN 
woman_N (mkRS (mkRCl which_RP say_VS (mkS (mkCl i_NP sleep_V)))) -woman who says that I sleep -หญิงที่บอกว่าฉันนอนหลับ -- put better than bok? - -mkCN woman_N (mkRS (mkRCl which_RP wonder_VQ (mkQS (mkQCl who_IP sleep_V)))) -woman who wonders who sleeps -หญิงที่ประหลาดใจว่าใครนอนหลับ - -mkCN woman_N (mkRS (mkRCl which_RP become_VA old_A)) -woman who becomes old -หญิงที่กลายเป็นคนแก่ -- classifier !! - -mkCN woman_N (mkRS (mkRCl which_RP become_VA (mkAP very_AdA old_A))) -woman who becomes very old -หญิงที่กลายเป็นคนแก่มาก -- classifier - -mkCN woman_N (mkRS (mkRCl which_RP paint_V2A it_NP red_A)) -woman who paints it red -หญิงที่ทามันสีแดง - -mkCN woman_N (mkRS (mkRCl which_RP paint_V2A it_NP (mkAP very_AdA red_A))) -woman who paints it very red -หญิงที่ทามันสีแดงมาก - -mkCN woman_N (mkRS (mkRCl which_RP answer_V2S he_NP (mkS (mkCl we_NP sleep_V)))) -woman who answers to him that we sleep -หญิงที่ตอบเขาว่าเรานอนหลับ -- compl - -mkCN woman_N (mkRS (mkRCl which_RP ask_V2Q he_NP (mkQS (mkQCl who_IP sleep_V)))) -woman who asks him who sleeps -หญิงที่ถามเขาว่าใครนอนหลับ -- ay - -mkCN woman_N (mkRS (mkRCl which_RP beg_V2V he_NP (mkVP sleep_V))) -woman who begs him to sleep -หญิงที่ขอให้เขานอนหลับ -- beg - -mkCN woman_N (mkRS (mkRCl which_RP old_A)) -woman who is old -หญิงที่แก่ - -mkCN woman_N (mkRS (mkRCl which_RP old_A he_NP)) -woman who is older than he -หญิงที่แก่กว่าเขา - -mkCN woman_N (mkRS (mkRCl which_RP married_A2 he_NP)) -woman who is married to him -หญิงที่แต่งงานแล้วกับเขา - -mkCN woman_N (mkRS (mkRCl which_RP (mkAP very_AdA old_A))) -woman who is very old -หญิงที่แก่มาก - -mkCN woman_N (mkRS (mkRCl which_RP (mkNP the_Det woman_N))) -woman who is the woman -หญิงที่เป็นหญิง - -mkCN student_N (mkRS (mkRCl which_RP woman_N)) -student who is a woman -นักศึกษาที่เป็นหญิง - -mkCN student_N (mkRS (mkRCl which_RP (mkCN old_A woman_N))) -student who is an old woman -นักศึกษาที่เป็นหญิงแก่ - -mkCN woman_N (mkRS (mkRCl which_RP here_Adv)) -woman who is here 
-หญิงที่อยู่ที่นี่ - -mkCN woman_N (mkRS (mkRCl which_RP (mkVP always_AdV (mkVP sleep_V)))) -woman who always sleeps -หญิงที่นอนหลับเสมอ -- always - -mkCN woman_N (mkRS (mkRCl which_RP we_NP love_V2)) -woman whom we love -หญิงที่เรารัก - -mkCN woman_N (mkRS (mkRCl which_RP (mkClSlash (mkClSlash she_NP love_V2) today_Adv))) -woman whom she loves today -หญิงที่หล่อนรักวันนี้ - -mkRP in_Prep (mkNP all_Predet (mkNP the_Quant pluralNum city_N)) which_RP -all the cities in whom -ทุกเมืองในที่ -- or เมืองทั้งหมดในที่ - -mkSSlash (mkTemp pastTense anteriorAnt) negativePol (mkClSlash she_NP (mkVPSlash see_V2)) -she hadn't seen -หล่อนไม่เห็น - -mkQCl who_IP (mkClSlash she_NP (mkVPSlash see_V2)) -whom does she see -หล่อนเห็นใคร - -mkQCl who_IP (mkClSlash she_NP see_V2) -whom does she see -หล่อนเห็นใคร - -mkQCl who_IP (mkClSlash she_NP want_VV see_V2) -whom does she want to see -หล่อนอยากเห็นใคร - -mkQCl who_IP (mkClSlash (mkCl she_NP sleep_V) with_Prep) -with whom does she sleep -หล่อนนอนหลับกับใคร - -mkQCl who_IP (mkClSlash (mkClSlash she_NP see_V2) today_Adv) -whom does she see today -หล่อนเห็นใครวันนี้ - -mkQCl who_IP (mkClSlash she_NP know_VS (mkSSlash (mkTemp pastTense anteriorAnt) negativePol (mkClSlash we_NP (mkVPSlash see_V2)))) -whom does she know that we hadn't seen -ใครที่หล่อนรู้ว่าเราไม่เห็น -- khray first !! - -mkQCl who_IP (mkClSlash she_NP (mkVPSlash see_V2)) -whom does she see -หล่อนเห็นใคร -- or khray thi lon hin !! - -mkQCl who_IP (mkClSlash she_NP (mkVPSlash send_V3 it_NP)) -to whom does she send it -หล่อนส่งมันให้ใคร -- send - -mkQCl who_IP (mkClSlash she_NP (mkVPSlash paint_V2A (mkAP red_A))) -whom does she paint red -ใครที่หล่อนทาสีแดง - -mkQCl who_IP (mkClSlash she_NP (mkVPSlash ask_V2Q (mkQS (mkQCl where_IAdv (mkCl i_NP sleep_V))))) -whom does she ask where I sleep -หล่อนถามใครว่าฉันนอนหลับที่ไหน -- order !! 
- -mkQCl who_IP (mkClSlash she_NP (mkVPSlash answer_V2S (mkS (mkCl i_NP sleep_V)))) -to whom does she answer that I sleep -หล่อนตอบใครว่าฉันนอนหลับ -- compl - -mkQCl who_IP (mkClSlash she_NP (mkVPSlash beg_V2V (mkVP sleep_V))) -whom does she beg to sleep -หล่อนขอนอนหลับกับใคร - -mkQCl who_IP (mkClSlash she_NP (mkVPSlash want_VV (mkVPSlash see_V2))) -whom does she want to see -หล่อนอยากเห็นใคร - -mkQCl who_IP (mkClSlash she_NP (mkVPSlash beg_V2V i_NP (mkVPSlash see_V2))) -whom does she beg me to see -หล่อนขอให้ฉันเห็นใคร -- beg - -mkAdv above_Prep it_NP -above it -ข้างบนมัน - -mkAdv after_Prep it_NP -after it -หลังจากมัน -- lang cak - -mkUtt (mkNP all_Predet (mkNP thePl_Det man_N)) -all the men -ชายทั้งหมด -- order !! - -mkAP almost_AdA red_A -almost red -เกิอบสีแดง -- order !! - -mkCard almost_AdN (mkCard (mkNumeral n8_Unit)) -almost eight -เกิอบแปด -- order !! - -mkAdv although_Subj (mkS (mkCl she_NP sleep_V)) -although she sleeps -ถืงหล่อนนอนหลับ - -always_AdV -always -เสมอ -- always! 
- -mkAdv and_Conj here_Adv now_Adv -here and now -ที่นี่และเดี๋ยวนี้ -- tone - -mkAdv because_Subj (mkS (mkCl she_NP sleep_V)) -because she sleeps -เพราะหล่อนนอนหลับ - -mkAdv before_Prep it_NP -before it -ก่อนมัน - -mkAdv behind_Prep it_NP -behind it -หลังมัน - -mkAdv between_Prep (mkNP and_Conj you_NP i_NP) -between you and me -ระหว่างคุณและฉัน -- place of tone mark - -mkAdv both7and_DConj here_Adv there_Adv -both here and there -ทั้งที่นี่และที่นั่น -- both - -but_PConj -but -แต่ - -mkAdv by8agent_Prep it_NP -by it -มัน - -mkAdv by8means_Prep it_NP -by it -ผ่านมัน - -mkUtt (mkVP can8know_VV (mkVP sleep_V)) -to be able to sleep -นอนหลับได้ - -mkUtt (mkVP can_VV (mkVP sleep_V)) -to be able to sleep -นอนหลับได้ - -mkAdv during_Prep it_NP -during it -ที่มัน -- ra wang man (= between) - -mkAdv either7or_DConj here_Adv there_Adv -either here or there -ที่นี่หริอที่นั่น -- or: mai - ko - -mkUtt (mkNP every_Det woman_N) -every woman -หญิงทุกคน - -mkUtt everybody_NP -everybody -ทุกคน -- every person - -mkUtt everything_NP -everything -ทุกสิ่ง - -everywhere_Adv -everywhere -ทุกที่ - -mkAdv for_Prep it_NP -for it -ให้มัน - -mkAdv from_Prep it_NP -from it -จากมัน -- a: - -mkUtt (mkNP he_Pron) -he -เขา - -here_Adv -here -ที่นี่ - -here7to_Adv -to here -ที่นี่ - -here7from_Adv -from here -จากนี่ -- a: - -mkUtt how_IAdv -how -อย่างไร - -mkUtt (mkIP how8many_IDet house_N) -how many houses -บ้านกี่หลัง - -mkUtt how8much_IAdv -how much -เท่าไร - -mkUtt (mkNP i_Pron) -I -ฉัน - -mkAdv if_Subj (mkS (mkCl she_NP sleep_V)) -if she sleeps -ถ้าหล่อนนอนหลับ - -mkAdv in8front_Prep it_NP -in front of it -หน้ามัน - -mkAdv in_Prep it_NP -in it -ในมัน - -mkUtt (mkNP it_Pron) -it -มัน - -less_CAdv -less -น้อย - -mkUtt (mkNP many_Det house_N) -many houses -บ้านหลายหลัง - -more_CAdv -more -มากกว่า -- more than - -most_Predet -most -มากที่สุด - -mkUtt (mkNP much_Det wine_N) -much wine -เหล้าองุ่นหลายขวด -- the word much - -must_VV -have to -ต้อง - -no_Utt -no -ไม่ - -mkAdv on_Prep it_NP -on it 
-บนมัน - -only_Predet -only -เท่านั้น -- tone mark - -mkAdv or_Conj here_Adv there_Adv -here or there -ที่นี่หริอที่นั่น - -otherwise_PConj -otherwise -ไม่อย่างนั้น -- was very informal - -mkAdv part_Prep it_NP -of it -มัน - -please_Voc -please -ขอ - -mkAdv possess_Prep it_NP -of it -ของมัน - -quite_Adv -quite -ค่อนข้าง -- tone mark - -mkUtt (mkNP she_Pron) -she -หล่อน - -so_AdA -so -ดังนั้น -- (maak, if alone) !! - -mkUtt (mkNP someSg_Det wine_N) -some wine -เหล้าองุ่นบางขวด -- tone mark in wrong order; second tone mark off - -mkUtt (mkNP somePl_Det woman_N) -some women -หญิงบางคน -- tone mark - -mkUtt somebody_NP -somebody -บางคน - -mkUtt something_NP -something -บางสิ่ง - -somewhere_Adv -somewhere -บางแห่ง - -mkUtt (mkNP that_Quant house_N) -that house -บ้านหลังนั้น - -mkAdv that_Subj (mkS (mkCl she_NP sleep_V)) -that she sleeps -ว่าหล่อนนอนหลับ - -there_Adv -there -ที่นั่น - -there7to_Adv -there -ที่นั่น - -there7from_Adv -from there -จากนั่น -- a: - -therefore_PConj -therefore -เพราะฉะนั้น - -mkUtt (mkNP they_Pron) -they -เขา - -mkUtt (mkNP this_Quant house_N) -this house -บ้านหลังนี้ - -mkAdv through_Prep it_NP -through it -ผ่านมัน - -mkAdv to_Prep it_NP -to it -ถืงมัน - -too_AdA -too -เกินไป - -mkAdv under_Prep it_NP -under it -ใต้มัน - -very_AdA -very -มาก - -want_VV -want -อยาก - -mkUtt (mkNP we_Pron) -we -เรา - -whatPl_IP -what -อะไร - -whatSg_IP -what -อะไร - -mkUtt when_IAdv -when -เมื่อไร - -mkAdv when_Subj (mkS (mkCl she_NP sleep_V)) -when she sleeps -ที่หล่อนนอนหลับ -- tone mark, th !! tone mark in Subj? 
- -mkUtt where_IAdv -where -ที่ไหน - -mkIP which_IQuant house_N -which house -บ้านหลังไหน - -whoPl_IP -who -ใคร - -whoSg_IP -who -ใคร - -mkUtt why_IAdv -why -ทำไม -- th - -mkAdv with_Prep it_NP -with it -กับมัน - -mkAdv without_Prep it_NP -without it -ไม่มีมัน -- tone mark - -yes_Utt -yes -ใช่ - -mkUtt (mkNP youSg_Pron) -you -คุณ - -mkUtt (mkNP youPl_Pron) -you -คุณ - -mkUtt (mkNP youPol_Pron) -you -คุณ - -mkUtt (mkNP no_Quant house_N) -no house -ไม่มีบ้าน -- may mi + N + C - -mkUtt (mkNP not_Predet everybody_NP) -not everybody -ไม่ทุกคน - -mkAdv if_then_Conj here_Adv there_Adv -if here then there -ถ้าที่นี่ก็ที่นั่น - -mkCard at_least_AdN (mkCard (mkNumeral n8_Unit)) -at least eight -อย่างน้อยแปด -- order - -mkCard at_most_AdN (mkCard (mkNumeral n8_Unit)) -at most eight -อย่างมากแปด -- order - -mkUtt nobody_NP -nobody -ไม่มีใคร - -mkUtt nothing_NP -nothing -เปล่า - -mkAdv except_Prep it_NP -except it -นอกจากมัน - -as_CAdv -as -เท่า - -mkUtt (mkVP have_V2 it_NP) -to have it -มีมัน - diff --git a/doc/Makefile b/doc/Makefile index 249095a9d..bb2880692 100644 --- a/doc/Makefile +++ b/doc/Makefile @@ -1,66 +1,18 @@ -.PHONY: all index status synopsis abstract +.PHONY: all status synopsis abstract + +GFDOC=gfdoc +S=../src all: synopsis -GF_alltenses=$(GF_LIB_PATH)/alltenses -GF=gf -GFDOC=gfdoc +status: status.html -index: - txt2tags -thtml index.txt -status: +synopsis: + make -C synopsis + +status.html: txt2tags -thtml status.txt -synopsis: synopsis.html - -S=../src - -# List of languages extracted from languages.csv, with 'Synopsis' column == y -LANGS=$(shell cat ../languages.csv | cut -d',' -f1,10 | grep ',y' | cut -d',' -f1) - -# This list was constructed by observing what files MkSynopsis.hs reads -SRC_FILES=$(S)/abstract/Common.gf $(S)/abstract/Cat.gf $(S)/api/Constructors.gf $(S)/abstract/Structural.gf $(patsubst %,$S/*/Paradigms%.gf,$(LANGS)) - -EXAMPLES_OUT=$(patsubst %,api-examples-%.txt,$(LANGS)) -INCLUDES=synopsis-intro.txt categories-intro.txt 
categories-imagemap.html synopsis-additional.txt synopsis-browse.txt synopsis-example.txt - -synopsis.txt: MkSynopsis.hs MkExxTable.hs $(INCLUDES) $(EXAMPLES_OUT) $(SRC_FILES) - runghc -i.. MkSynopsis.hs - -TMP=tmp.html -synopsis.html: synopsis.txt _template.html - txt2tags --target=html --no-headers --quiet --toc --outfile=$@ --infile=$< - pandoc \ - --from=html \ - --to=html5 \ - --standalone \ - --template=_template.html \ - --css=synopsis.css \ - --metadata='title:"GF Resource Grammar Library: Synopsis"' \ - --variable='rel-root:../..' \ - --output=$(TMP) \ - $@ - mv $(TMP) $@ - -categories.png: categories.dot - dot -Tpng $^ > $@ - -categories-imagemap.html: categories.dot - dot -Tcmapx $^ > $@ - abstract: $(GFDOC) -txthtml $S/abstract/*.gf mv $S/abstract/*.html abstract - -api-examples.gfs: api-examples.txt MkExx.hs - runghc MkExx.hs < $< > $@ - -# Since .gfo files aren't self-contained, the dependencies given here are -# incomplete. But I am thinking that the Try%.gfo file will always be newer -# than any other files it depends on, so the rule will trigger when -# needed anyway. //TH 2018-10-22 -api-examples-%.txt: $(GF_alltenses)/Try%.gfo api-examples.gfs - GF_LIB_PATH=$(GF_LIB_PATH) $(GF) -retain -s $< $@ - -clean: - rm -rf synopsis.txt api-examples.gfs $(EXAMPLES_OUT) diff --git a/doc/Test.hs b/doc/Test.hs deleted file mode 100644 index 25165cb0a..000000000 --- a/doc/Test.hs +++ /dev/null @@ -1,22 +0,0 @@ -import qualified Data.Map as Map -import Data.Char - -gold = "CC_eng_tha.txt" -tested = "api-examples-Tha.txt" - -main = do - s <- readFile gold - let corrects = Map.fromList $ exx 1 5 2 (lines s) --- mapM_ putStrLn $ concat [[t,s] | (t,s) <- Map.toList corrects] - t <- readFile tested - mapM_ (doTest corrects) (exx 18 22 1 (map (drop 4) (lines t))) - -exx x y z ss = [(ss!!k,ss!!(k+z)) | k <- [x,y .. 
length ss - 2]] - -doTest corrects (t,s) = case Map.lookup t corrects of - Just c -> if unspace s == uncomment c then return () else mapM_ putStrLn [t,unspace s,c] - _ -> return () - -unspace = filter (not . isSpace) -uncomment = unspace . takeWhile (/= '-') - diff --git a/doc/categories.png b/doc/categories.png deleted file mode 100644 index 590540fc5..000000000 Binary files a/doc/categories.png and /dev/null differ diff --git a/doc/editor.png b/doc/editor.png deleted file mode 100644 index 63a3161bf..000000000 Binary files a/doc/editor.png and /dev/null differ diff --git a/doc/index.txt b/doc/index.txt deleted file mode 100644 index 86d1f3d0c..000000000 --- a/doc/index.txt +++ /dev/null @@ -1,267 +0,0 @@ -GF Resource Grammar Library v. 1.2 -Author: Aarne Ranta -Last update: %%date(%c) - -% NOTE: this is a txt2tags file. -% Create an html file from this file using: -% txt2tags --toc -thtml index.txt - -%!target:html - -%!postproc(html): #BCEN
-%!postproc(html): #ECEN
- - -#BCEN - -[10lang-large.png] - -#ECEN - - -The GF Resource Grammar Library defines the basic grammar of -ten languages: -Danish, English, Finnish, French, German, -Italian, Norwegian, Russian, Spanish, Swedish. -Still incomplete implementations for Arabic and Catalan are also -included. - -**New** in December 2007: Browsing the library by syntax editor -[directly on the web ../../../demos/resource-api/editor.html]. - - - - -==Authors== - -Inger Andersson and Therese Soderberg (Spanish morphology), -Nicolas Barth and Sylvain Pogodalla (French verb list), -Ali El Dada (Arabic modules), -Magda Gerritsen and Ulrich Real (Russian paradigms and lexicon), -Janna Khegai (Russian modules), -Bjorn Bringert (many Swadesh lexica), -Carlos Gonzala (Spanish cardinals), -Harald Hammarström (German morphology), -Patrik Jansson (Swedish cardinals), -Andreas Priesnitz (German lexicon), -Aarne Ranta, -Jordi Saludes (Catalan modules), -Henning Thielemann (German lexicon). - - -We are grateful for contributions and -comments to several other people who have used this and -the previous versions of the resource library, including -Ludmilla Bogavac, -Ana Bove, -David Burke, -Lauri Carlson, -Gloria Casanellas, -Karin Cavallin, -Robin Cooper, -Hans-Joachim Daniels, -Elisabet Engdahl, -Markus Forsberg, -Kristofer Johannisson, -Anni Laine, -Hans Leiss, -Peter Ljunglöf, -Saara Myllyntausta, -Wanjiku Ng'ang'a, -Nadine Perera, -Jordi Saludes. - - -==License== - -The GF Resource Grammar Library is open-source software licensed under -GNU Lesser General Public License (LGPL). See the file [LICENSE ../LICENSE] for more -details. - - -==Scope== - -Coverage, for each language: -- complete morphology -- lexicon of the ca. 100 most important structural words -- test lexicon of ca. 300 content words (rough equivalents in each language) -- list of irregular verbs (separately for each language) -- representative fragment of syntax (cf. CLE (Core Language Engine)) -- rather flat semantics (cf.
Quasi-Logical Form of CLE) - - -Organization: -- top-level (API) modules -- Ground API + special-purpose APIs -- "school grammar" concepts rather than advanced linguistic theory - - -Presentation: -- tool ``gfdoc`` for generating HTML from grammars -- example collections - - -==Location== - -Assuming you have installed the libraries, you will find the precompiled -``gfc`` and ``gfr`` files directly under ``$GF_LIB_PATH``, whose default -value is ``/usr/local/share/GF/``. The precompiled subdirectories are -``` - alltenses - mathematical - multimodal - present -``` -Do for instance -``` - cd $GF_LIB_PATH - gf alltenses/langs.gfcm - - > p -cat=S -lang=LangEng "this grammar is too big" | tb -``` -For more details, see the [Synopsis synopsis.html]. - - -==Compilation== - -If you want to compile the library from scratch, use ``make`` in the root of -the source directory: -``` - cd GF/lib/resource-1.0 - make -``` -The ``make`` procedure does not by default make Arabic and Catalan, but you -can uncomment the relevant lines in ``Makefile`` to compile them. - - -==Encoding== - -Finnish, German, Romance, and Scandinavian languages are in isolatin-1. - -Arabic and Russian are in UTF-8. - -English is in pure ASCII. - -The different encodings imply, unfortunately, that it is hard to get -a nice view of all languages simultaneously. The easiest way to achieve this is -to use ``gfeditor``, which automatically converts grammars to UTF-8. - - -==Using the resource as library== - -This API is accessible by both ``present`` and ``alltenses``. The modules you most often need are -- ``Syntax``, the interface to syntactic structures -- ``Syntax``//L//, the implementations of ``Syntax`` for each language //L// -- ``Paradigms``//L//, the morphological paradigms for each language //L// - - -The [Synopsis synopsis.html] gives examples on the typical usage of these -modules. - - -==Using the resource as top level grammar== - -The following modules can be used for parsing and linearization. 
They are accessible from both -``present`` and ``alltenses``. -- ``Lang``//L// for each language //L//, implementing a common abstract syntax ``Lang`` -- ``Danish``, ``English``, etc, implementing ``Lang`` with language-specific extensions - - -In addition, there is in both ``present`` and ``alltenses`` the file -- ``langs.gfcm``, a package with precompiled ``Lang``//L// grammars - - -A way to test and view the resource grammar is to load ``langs.gfcm`` either into ``gfeditor`` -or into the ``gf`` shell and perform actions such as syntax editing and treebank generation. -For instance, the command -``` - > p -lang=LangEng -cat=S "this grammar is too big" | tb -``` -creates a treebank entry with translations of this sentence. - -For parsing, currently only English and the Scandinavian languages are within the limits of -reasonable resources. For other languages //L//, parsing with ``Lang``//L// will probably eat -up the computer resources before finishing the parser generation. - - - -==Accessing the lower level ground API== - -The ``Syntax`` API is implemented in terms of a bunch of ``abstract`` modules, which -as of version 1.2 are mainly interesting for implementors of the resource. -See the [documentation for version 1.1 index-1.1.html] for more details. - - -==Known bugs and missing components== - -Danish -- the lexicon and chosen inflections are only partially verified - - -English - - -Finnish -- wrong cases in some passive constructions - - -French -- multiple clitics (with V3) not always right -- third person pronominal questions with inverted word order - have wrong forms if "t" is required - (e.g.
"comment fera-t-il" becomes "comment fera il") - - -German - - -Italian -- multiple clitics (with V3) not always right - - -Norwegian -- the lexicon and chosen inflections are only partially verified - - -Russian -- some functions missing -- some regular paradigms are missing - - -Spanish -- multiple clitics (with V3) not always right -- missing contractions with imperatives and clitics - - -Swedish - - - - -==More reading== - -[Synopsis synopsis.html]. The concise guide to API v. 1.2. - -[Grammars as Software Libraries gslt-sem-2006.html]. Slides -with background and motivation for the resource grammar library. - -[GF Resource Grammar Library Version 1.0 clt2006.html]. Slides -giving an overview of the library and practical hints on its use. - -[How to write resource grammars Resource-HOWTO.html]. Helps you -start if you want to add another language to the library. - -[Parametrized modules for Romance languages http://www.cs.chalmers.se/~aarne/geocal2006.pdf]. -Slides explaining some ideas in the implementation of -French, Italian, and Spanish. - -[Grammar writing by examples http://www.cs.chalmers.se/~aarne/slides/webalt-2005.pdf]. -Slides showing how linearization rules are written as strings parsable by the resource grammar. - -[Multimodal Resource Grammars http://www.cs.chalmers.se/~aarne/slides/talk-edin2005.pdf]. -Slides showing how to use the multimodal resource library. N.B. the library -examples are from ``multimodal/old``, which is a reduced-size API. - -[GF Resource Grammar Library ../../../doc/resource.pdf] (pdf). -Printable user manual with API documentation, for version 1.0. 
- diff --git a/doc/official.txt b/doc/official.txt deleted file mode 100644 index 1216226e2..000000000 --- a/doc/official.txt +++ /dev/null @@ -1,581 +0,0 @@ -The Official EU languages - -The 20 official languages of the EU and their abbreviations are as follows: - -Espaol ES Spanish -Dansk DA Danish -Deutsch DE German -Elinika EL Greek -English EN -Franais FR French -Italiano IT Italian -Nederlands NL Dutch -Portugus PT Portuguese -Suomi FI Finnish -Svenska SV Swedish -?e?tina CS Czech -Eesti ET Estonian -Latviesu valoda LV Latvian -Lietuviu kalba LT Lithuanian -Magyar HU Hungarian -Malti MT Maltese -Polski PL Polish -Sloven?ina SK Slovak -Sloven??ina SL Slovene - -http://europa.eu.int/comm/education/policies/lang/languages/index_en.html - ------ -http://www.w3.org/WAI/ER/IG/ert/iso639.htm - -ar arabic -no norwegian -ru russian - --- - -ISO 639: 3-letter codes - -abk ab Abkhazian -ace Achinese -ach Acoli -ada Adangme -aar aa Afar -afh Afrihili -afr af Afrikaans -afa Afro-Asiatic (Other) -aka Akan -akk Akkadian -alb/sqi sq Albanian -ale Aleut -alg Algonquian languages -tut Altaic (Other) -amh am Amharic -apa Apache languages -ara ar Arabic -arc Aramaic -arp Arapaho -arn Araucanian -arw Arawak -arm/hye hy Armenian -art Artificial (Other) -asm as Assamese -ath Athapascan languages -map Austronesian (Other) -ava Avaric -ave Avestan -awa Awadhi -aym ay Aymara -aze az Azerbaijani -nah Aztec -ban Balinese -bat Baltic (Other) -bal Baluchi -bam Bambara -bai Bamileke languages -bad Banda -bnt Bantu (Other) -bas Basa -bak ba Bashkir -baq/eus eu Basque -bej Beja -bem Bemba -ben bn Bengali -ber Berber (Other) -bho Bhojpuri -bih bh Bihari -bik Bikol -bin Bini -bis bi Bislama -bra Braj -bre be Breton -bug Buginese -bul bg Bulgarian -bua Buriat -bur/mya my Burmese -bel be Byelorussian -cad Caddo -car Carib -cat ca Catalan -cau Caucasian (Other) -ceb Cebuano -cel Celtic (Other) -cai Central American Indian (Other) -chg Chagatai -cha Chamorro -che Chechen -chr Cherokee -chy 
Cheyenne -chb Chibcha -chi/zho zh Chinese -chn Chinook jargon -cho Choctaw -chu Church Slavic -chv Chuvash -cop Coptic -cor Cornish -cos co Corsican -cre Cree -mus Creek -crp Creoles and Pidgins (Other) -cpe Creoles and Pidgins, English-based (Other) -cpf Creoles and Pidgins, French-based (Other) -cpp Creoles and Pidgins, Portuguese-based (Other) -cus Cushitic (Other) - hr Croatian -ces/cze cs Czech -dak Dakota -dan da Danish -del Delaware -din Dinka -div Divehi -doi Dogri -dra Dravidian (Other) -dua Duala -dut/nla nl Dutch -dum Dutch, Middle (ca. 1050-1350) -dyu Dyula -dzo dz Dzongkha -efi Efik -egy Egyptian (Ancient) -eka Ekajuk -elx Elamite -eng en English -enm English, Middle (ca. 1100-1500) -ang English, Old (ca. 450-1100) -esk Eskimo (Other) -epo eo Esperanto -est et Estonian -ewe Ewe -ewo Ewondo -fan Fang -fat Fanti -fao fo Faroese -fij fj Fijian -fin fi Finnish -fiu Finno-Ugrian (Other) -fon Fon -fra/fre fr French -frm French, Middle (ca. 1400-1600) -fro French, Old (842- ca. 1400) -fry fy Frisian -ful Fulah -gaa Ga -gae/gdh Gaelic (Scots) -glg gl Gallegan -lug Ganda -gay Gayo -gez Geez -geo/kat ka Georgian -deu/ger de German -gmh German, Middle High (ca. 1050-1500) -goh German, Old High (ca. 
750-1050) -gem Germanic (Other) -gil Gilbertese -gon Gondi -got Gothic -grb Grebo -grc Greek, Ancient (to 1453) -ell/gre el Greek, Modern (1453-) -kal kl Greenlandic -grn gn Guarani -guj gu Gujarati -hai Haida -hau ha Hausa -haw Hawaiian -heb he Hebrew -her Herero -hil Hiligaynon -him Himachali -hin hi Hindi -hmo Hiri Motu -hun hu Hungarian -hup Hupa -iba Iban -ice/isl is Icelandic -ibo Igbo -ijo Ijo -ilo Iloko -inc Indic (Other) -ine Indo-European (Other) -ind id Indonesian -ina ia Interlingua (International Auxiliary language Association) -ine - Interlingue -iku iu Inuktitut -ipk ik Inupiak -ira Iranian (Other) -gai/iri ga Irish -sga Irish, Old (to 900) -mga Irish, Middle (900 - 1200) -iro Iroquoian languages -ita it Italian -jpn ja Japanese -jav/jaw jv/jw Javanese -jrb Judeo-Arabic -jpr Judeo-Persian -kab Kabyle -kac Kachin -kam Kamba -kan kn Kannada -kau Kanuri -kaa Kara-Kalpak -kar Karen -kas ks Kashmiri -kaw Kawi -kaz kk Kazakh -kha Khasi -khm km Khmer -khi Khoisan (Other) -kho Khotanese -kik Kikuyu -kin rw Kinyarwanda -kir ky Kirghiz -kom Komi -kon Kongo -kok Konkani -kor ko Korean -kpe Kpelle -kro Kru -kua Kuanyama -kum Kumyk -kur ku Kurdish -kru Kurukh -kus Kusaie -kut Kutenai -lad Ladino -lah Lahnda -lam Lamba -oci oc Langue d'Oc (post 1500) -lao lo Lao -lat la Latin -lav lv Latvian -ltz Letzeburgesch -lez Lezghian -lin ln Lingala -lit lt Lithuanian -loz Lozi -lub Luba-Katanga -lui Luiseno -lun Lunda -luo Luo (Kenya and Tanzania) -mac/mak mk Macedonian -mad Madurese -mag Magahi -mai Maithili -mak Makasar -mlg mg Malagasy -may/msa ms Malay -mal Malayalam -mlt ml Maltese -man Mandingo -mni Manipuri -mno Manobo languages -max Manx -mao/mri mi Maori -mar mr Marathi -chm Mari -mah Marshall -mwr Marwari -mas Masai -myn Mayan languages -men Mende -mic Micmac -min Minangkabau -mis Miscellaneous (Other) -moh Mohawk -mol mo Moldavian -mkh Mon-Kmer (Other) -lol Mongo -mon mn Mongolian -mos Mossi -mul Multiple languages -mun Munda languages -nau na Nauru -nav Navajo 
-nde Ndebele, North -nbl Ndebele, South -ndo Ndongo -nep ne Nepali -new Newari -nic Niger-Kordofanian (Other) -ssa Nilo-Saharan (Other) -niu Niuean -non Norse, Old -nai North American Indian (Other) -nor no Norwegian -nno Norwegian (Nynorsk) -nub Nubian languages -nym Nyamwezi -nya Nyanja -nyn Nyankole -nyo Nyoro -nzi Nzima -oji Ojibwa -ori or Oriya -orm om Oromo -osa Osage -oss Ossetic -oto Otomian languages -pal Pahlavi -pau Palauan -pli Pali -pam Pampanga -pag Pangasinan -pan pa Panjabi -pap Papiamento -paa Papuan-Australian (Other) -fas/per fa Persian -peo Persian, Old (ca 600 - 400 B.C.) -phn Phoenician -pol pl Polish -pon Ponape -por pt Portuguese -pra Prakrit languages -pro Provencal, Old (to 1500) -pus ps Pushto -que qu Quechua -roh rm Rhaeto-Romance -raj Rajasthani -rar Rarotongan -roa Romance (Other) -ron/rum ro Romanian -rom Romany -run rn Rundi -rus ru Russian -sal Salishan languages -sam Samaritan Aramaic -smi Sami languages -smo sm Samoan -sad Sandawe -sag sg Sango -san sa Sanskrit -srd Sardinian -sco Scots -sel Selkup -sem Semitic (Other) - sr Serbian -scr sh Serbo-Croatian -srr Serer -shn Shan -sna sn Shona -sid Sidamo -bla Siksika -snd sd Sindhi -sin si Singhalese -sit - Sino-Tibetan (Other) -sio Siouan languages -sla Slavic (Other) -ssw ss Siswant -slk/slo sk Slovak -slv sl Slovenian -sog Sogdian -som so Somali -son Songhai -wen Sorbian languages -nso Sotho, Northern -sot st Sotho, Southern -sai South American Indian (Other) -esl/spa es Spanish -suk Sukuma -sux Sumerian -sun su Sudanese -sus Susu -swa sw Swahili -ssw Swazi -sve/swe sv Swedish -syr Syriac -tgl tl Tagalog -tah Tahitian -tgk tg Tajik -tmh Tamashek -tam ta Tamil -tat tt Tatar -tel te Telugu -ter Tereno -tha th Thai -bod/tib bo Tibetan -tig Tigre -tir ti Tigrinya -tem Timne -tiv Tivi -tli Tlingit -tog to Tonga (Nyasa) -ton Tonga (Tonga Islands) -tru Truk -tsi Tsimshian -tso ts Tsonga -tsn tn Tswana -tum Tumbuka -tur tr Turkish -ota Turkish, Ottoman (1500 - 1928) -tuk tk Turkmen -tyv 
Tuvinian -twi tw Twi -uga Ugaritic -uig ug Uighur -ukr uk Ukrainian -umb Umbundu -und Undetermined -urd ur Urdu -uzb uz Uzbek -vai Vai -ven Venda -vie vi Vietnamese -vol vo Volapk -vot Votic -wak Wakashan languages -wal Walamo -war Waray -was Washo -cym/wel cy Welsh -wol wo Wolof -xho xh Xhosa -sah Yakut -yao Yao -yap Yap -yid yi Yiddish -yor yo Yoruba -zap Zapotec -zen Zenaga -zha za Zhuang -zul zu Zulu -zun Zuni - -ISO 639: 2-letter codes - -AA "Afar" -AB "Abkhazian" -AF "Afrikaans" -AM "Amharic" -AR "Arabic" -AS "Assamese" -AY "Aymara" -AZ "Azerbaijani" -BA "Bashkir" -BE "Byelorussian" -BG "Bulgarian" -BH "Bihari" -BI "Bislama" -BN "Bengali" "Bangla" -BO "Tibetan" -BR "Breton" -CA "Catalan" -CO "Corsican" -CS "Czech" -CY "Welsh" -DA "Danish" -DE "German" -DZ "Bhutani" -EL "Greek" -EN "English" "American" -EO "Esperanto" -ES "Spanish" -ET "Estonian" -EU "Basque" -FA "Persian" -FI "Finnish" -FJ "Fiji" -FO "Faeroese" -FR "French" -FY "Frisian" -GA "Irish" -GD "Gaelic" "Scots Gaelic" -GL "Galician" -GN "Guarani" -GU "Gujarati" -HA "Hausa" -HI "Hindi" -HR "Croatian" -HU "Hungarian" -HY "Armenian" -IA "Interlingua" -IE "Interlingue" -IK "Inupiak" -IN "Indonesian" -IS "Icelandic" -IT "Italian" -IW "Hebrew" -JA "Japanese" -JI "Yiddish" -JW "Javanese" -KA "Georgian" -KK "Kazakh" -KL "Greenlandic" -KM "Cambodian" -KN "Kannada" -KO "Korean" -KS "Kashmiri" -KU "Kurdish" -KY "Kirghiz" -LA "Latin" -LN "Lingala" -LO "Laothian" -LT "Lithuanian" -LV "Latvian" "Lettish" -MG "Malagasy" -MI "Maori" -MK "Macedonian" -ML "Malayalam" -MN "Mongolian" -MO "Moldavian" -MR "Marathi" -MS "Malay" -MT "Maltese" -MY "Burmese" -NA "Nauru" -NE "Nepali" -NL "Dutch" -NO "Norwegian" -OC "Occitan" -OM "Oromo" "Afan" -OR "Oriya" -PA "Punjabi" -PL "Polish" -PS "Pashto" "Pushto" -PT "Portuguese" -QU "Quechua" -RM "Rhaeto-Romance" -RN "Kirundi" -RO "Romanian" -RU "Russian" -RW "Kinyarwanda" -SA "Sanskrit" -SD "Sindhi" -SG "Sangro" -SH "Serbo-Croatian" -SI "Singhalese" -SK "Slovak" -SL "Slovenian" -SM 
"Samoan" -SN "Shona" -SO "Somali" -SQ "Albanian" -SR "Serbian" -SS "Siswati" -ST "Sesotho" -SU "Sudanese" -SV "Swedish" -SW "Swahili" -TA "Tamil" -TE "Tegulu" -TG "Tajik" -TH "Thai" -TI "Tigrinya" -TK "Turkmen" -TL "Tagalog" -TN "Setswana" -TO "Tonga" -TR "Turkish" -TS "Tsonga" -TT "Tatar" -TW "Twi" -UK "Ukrainian" -UR "Urdu" -UZ "Uzbek" -VI "Vietnamese" -VO "Volapuk" -WO "Wolof" -XH "Xhosa" -YO "Yoruba" -ZH "Chinese" -ZU "Zulu" diff --git a/doc/paradigms.txt b/doc/paradigms.txt deleted file mode 100644 index 0c4cf260c..000000000 --- a/doc/paradigms.txt +++ /dev/null @@ -1,48 +0,0 @@ -Morphological Paradigms in the GF Resource Grammar Library -Aarne Ranta - - -This is a synopsis of the main morphological paradigms for -nouns (``N``), adjectives (``A``), and verbs (``V``). - - -=English= - -``` - mkN : (flash : Str) -> N ; -- car, bus, ax, hero, fly, boy - mkN : (man,men : Str) -> N ; -- index, indices - mkN : (man,men,man's,men's : Str) -> N ; - mkN : Str -> N -> N ; -- baby boom - - mkA : (happy : Str) -> A ; -- small, happy, free - mkA : (fat,fatter : Str) -> A ; - mkA : (good,better,best,well : Str) -> A - compoundA : A -> A ; -- -/more/most ridiculous - - mkV : (cry : Str) -> V ; -- call, kiss, echo, cry, pray - mkV : (stop,stopped : Str) -> V ; - mkV : (drink,drank,drunk : Str) -> V ; - mkV : (run,ran,run,running : Str) -> V ; - mkV : (go,goes,went,gone,going : Str) -> V -``` - -=French= - -``` - mkN : (cheval : Str) -> N ; -- pas, prix, nez, bijou, cheval - mkN : (foie : Str) -> Gender -> N ; - mkN : (oeil,yeux : Str) -> Gender -> N ; - mkN : N -> Str -> N - - mkA : (cher : Str) -> A ; -- banal, heureux, italien, jeune, amer, carr, joli - mkA : (sec,seche : Str) -> A ; - mkA : (banal,banale,banaux,banalement : Str) -> A ; - mkA : (bon : A) -> (meilleur : A) -> A - prefixA : A -> A ; - - mkV : (finir : Str) -> V ; -- aimer, cder, placer, manger, payer, finir - mkV : (jeter,jette,jettera : Str) -> V ; - mkV : V2 -> V - etreV : V -> V ; - reflV : V -> V ; -``` 
diff --git a/doc/rgl-publications.html b/doc/rgl-publications.html deleted file mode 100644 index fdf576cb5..000000000 --- a/doc/rgl-publications.html +++ /dev/null @@ -1,529 +0,0 @@ - - - - - - -GF Resource Grammar Library Documentation and Publications - -
-

GF Resource Grammar Library Documentation and Publications

-Aarne Ranta
-20170119 -
- -

-To be completed. Contributions welcome - in particular, links to open access publications! -

- -

Afrikaans

- - - -

Amharic

- - - -

Arabic

- - - -

Bulgarian

- - - -

Catalan

- - - -

Chinese

- - - -

Danish

- - - -

Dutch

- - - -

English

- - - -

Estonian

- - - -

Finnish

- - - -

French

- - - -

German

- - - -

Greek

- - - -

Hebrew

- - - -

Hindi

- - - -

Icelandic

- - - -

Interlingua

- - - -

Italian

- - - -

Japanese

- - - -

Latin

- - - -

Latvian

- - - -

Maltese

- - - -

Nepali

- - - -

Norwegian (bokmål)

- - - -

Norwegian (nynorsk)

- - - -

Persian

- - - -

Polish

- - - -

Punjabi

- - - -

Romanian

- - - -

Russian

- - - -

Sindhi

- - - -

Spanish

- - - -

Swahili

- - - - - -

Swedish

- - - -

Thai

- - - -

Turkish

- - - -

Urdu

- - - - - - diff --git a/doc/status.html b/doc/status.html deleted file mode 100644 index 546385edf..000000000 --- a/doc/status.html +++ /dev/null @@ -1,862 +0,0 @@ - - - - - -The Status of the GF Resource Grammar Library - -
-

The Status of the GF Resource Grammar Library

-Aarne Ranta
-20170119 -
- -

-The following table gives the languages currently available in the -GF Resource Grammar Library. -

-

-For another view, see -The Resource Grammar Library coverage map. -

-

-Corrections and additions are welcome! Notice that only those parts of implementations -that are currently available via http://grammaticalframework.org -are marked in the table -

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
ISO Language Darcs Mini Parad Lex Lang API Symb Irreg Dict Trans tested publ authors
Afr Afrikaans +-+++++------ *LP,LM
Amh Amharic ++++++------+ *MK
Ara Arabic ++++-------+ AD
Bul Bulgarian +++++++++++++ *KA
Cat Catalan +++++++++-+++- *JS,*IL
Chi Chinese +-+++++--++-+ ZL,*AR,*CP,QH
Dan Danish +++++++++--+- *AR
Dut Dutch +++++++++-++- *AR,FJ
Eng English ++++++++++++++ *AR,BB,KA
Est Estonian +-+++++----++ *KK,*IL
Fin Finnish ++++++++-+++++ *AR,*IL
Fre French +++++++++++++- *AR,RE
Ger German +++++++++++++- *AR,HH,EG
Gre Greek(mod) +-+++++-----+ *IP
Grc Greek(anc) -----------+ *HLe
Heb Hebrew +----------+ *DD
Hin Hindi ++++++++--+++ *SV,*KP,MH,AR,PK
Ice Icelandic +-+++++-----+ *BT
Ina Interlingua ++++++------- JB
Ita Italian ++++++++--+++- *AR,*RE,GP
Jpn Japanese +-+++++---+++ *LZ
Lat Latin +-------+--- *AR,*HLa
Lav Latvian +-+++++----++ *NG,*PP
Mlt Maltese ++++++++----+ *JC
Mon Mongolian +-+++++--+--+ *NE
Nep Nepali ++++++------+ *DS
Nno Norwegian(n) +++++++++---- *SRE
Nor Norwegian(b) +++++++++--+- *AR
Pes Persian +-++++----++ *SV,*EA,SM
Pnb Punjabi +++++++----+ *SV,MH
Pol Polish +++++++---++ IN,*AS
Ron Romanian ++++++++---++ *RE
Rus Russian +++++++--++-+ JK,*NF
Snd Sindhi ++++++++----+ *SV,*JD
Spa Spanish +++++++++-+++- *AR,IA,TS,*IL
Swa Swahili +----------+ *WN,JM
Swe Swedish ++++++++++++++ *MA,*AR,MF
Tha Thai +-++++++--++- *AR,CK
Tsn Tswana ------------ *LPs,AB
Tur Turkish +-+++----+--- *SC,KA
Urd Urdu ++++++++---++ *SV,MH
- -

-ISO = 3-letter ISO language code, used in library file names -(mostly ISO 639-2 B (bibliographic)) -

-

-Darcs = available in the darcs repository of http://code.haskell.org/gf http://www.grammaticalframework.org/ -

-

-Mini = minimal resource, compiles with make minimal (obsolete) -

-

-Parad = Paradigms file complete for major POS, ++ means with smart paradigms -

-

-Lex = the resource Lexicon (nearly) complete -

-

-Lang = the resource Syntax (nearly) complete -

-

-API = the Syntax compiles -

-

-Symb = the Symbolic API compiles -

-

-Irreg = the Irreg module with irregular verbs exists -

-

-Dict = the Dict module, large-scale morphological lexicon, exists -

-

-Trans = large-scale translation module and dictionary exists -

-

-tested = tested in some applications, ++ means extensively tested with no major issues -

-

-publ = publications available, see RGL publication list -

-

-authors = main contributors, * means still active -(ready to fix bugs, answer questions, etc.) -

- -

Author codes

- -

-AB Ansu Berg, -AD Ali El Dada, -AR Aarne Ranta, -AS Adam Slaski, -BB Björn Bringert, -BT Bjarki Traustason, -CK Chotiros Kairoje, -CP Chen Peng, -DD Dana Dannélls, -DS Dinesh Simk, -EA Elnaz Abolahrar, -EG Erzsébet Galgóczy, -FJ Femke Johansson, -HH Harald Hammarström, -HLa Herbert Lange, -HLe Hans Leiss, -GP Gabriele Paganelli, -IA Ingrid Andersson, -IL Inari Listenmaa, -IN Ilona Novak, -IP Ioanna Papadopoulou, -JB Jean-Philippe Bernardy, -JC John J. Camilleri, -JD Jherna Devi, -JK Janna Khegai, -JM Juliet Mutahi, -JS Jordi Saludes, -KA Krasimir Angelov, -KK Kaarel Kaljurand, -KP Kuchi Prasad, -LM Laurette Marais, -LP Laurette Pretorius, -LZ Liza Zimina, -MA Malin Ahlberg, -MF Markus Forsberg, -MK Markos Kassa Gobena, -MH Muhammad Humayoun, -NE Nyamsuren Erdenebadrakh, -NF Nick Frolov, -NG Normunds Gruzitis, -QH Qiao Haiyan, -RE Ramona Enache, -PP Peteris Paikens, -SC Server Cimen, -SM Sofy Moradi, -SRE Stian Rødven Eide, -SV Shafqat Virk, -TS Therese Söderberg, -WN Wanjiku Ng'ang'a, -ZL Zhuo Lin Qiqige -

- -

Rules

- -

-Only components available at http://grammaticalframework.org are indicated in the table -(exceptions: Ancient Greek, Mongolian, to appear soon). -

-

-If you want to work on a language already in the table, please be kind and contact the -active authors of it. -

-

-Feel free to start a new language that is not yet in the table - but let us know and -contribute some code as soon as you can! -

- - - - diff --git a/doc/.gitignore b/doc/synopsis/.gitignore similarity index 62% rename from doc/.gitignore rename to doc/synopsis/.gitignore index d8273c264..b0e9b0a7b 100644 --- a/doc/.gitignore +++ b/doc/synopsis/.gitignore @@ -1,5 +1,6 @@ +index.txt +index.html api-examples-*.txt api-examples.gfs categories-imagemap.html -synopsis.txt -synopsis.html +categories.png diff --git a/doc/synopsis/Makefile b/doc/synopsis/Makefile new file mode 100644 index 000000000..6e3a96269 --- /dev/null +++ b/doc/synopsis/Makefile @@ -0,0 +1,67 @@ +# Your GF_LIB_PATH must be set in order for this build script to work + +.PHONY: all index clean + +GF_alltenses=$(GF_LIB_PATH)/alltenses +GF=gf +GFDOC=gfdoc + +ROOT=../.. +S=$(ROOT)/src +CONFIG=$(ROOT)/languages.csv + +# List of languages extracted from languages.csv, with 'Synopsis' column == y +LANGS=$(shell cat $(CONFIG) | cut -d',' -f1,10 | grep ',y' | cut -d',' -f1) + +# This list was constructed by observing what files MkSynopsis.hs reads +SRC_FILES=$(S)/abstract/Common.gf $(S)/abstract/Cat.gf $(S)/api/Constructors.gf $(S)/abstract/Structural.gf $(patsubst %,$S/*/Paradigms%.gf,$(LANGS)) + +EXAMPLES_OUT=$(patsubst %,api-examples-%.txt,$(LANGS)) +INCLUDES=intro.txt categories-intro.txt categories-imagemap.html additional.txt browse.txt example.txt + +TMP=tmp.html +TEMPLATE=template.html + +all: index + +index: index.html + +index.txt: MkSynopsis.hs MkExxTable.hs $(INCLUDES) $(EXAMPLES_OUT) $(SRC_FILES) + runghc -i$(ROOT) MkSynopsis.hs + +index.html: index.txt $(TEMPLATE) + txt2tags --target=html --no-headers --quiet --toc --outfile=$@ --infile=$< + pandoc \ + --from=html \ + --to=html5 \ + --standalone \ + --template=$(TEMPLATE) \ + --css=synopsis.css \ + --metadata='title:"GF Resource Grammar Library: Synopsis"' \ + --variable='rel-root:$(ROOT)/..' 
\ + --output=$(TMP) \ + $@ + mv $(TMP) $@ + +categories.png: categories.dot + dot -Tpng $^ > $@ + +categories-imagemap.html: categories.dot + dot -Tcmapx $^ > $@ + +api-examples.gfs: api-examples.txt MkExx.hs + runghc MkExx.hs < $< > $@ + +# Since .gfo files aren't self-contained, the dependencies given here are +# incomplete. But I am thinking that the Try%.gfo file will always be newer +# than any other files it depends on, so the rule will trigger when +# needed anyway. //TH 2018-10-22 +api-examples-%.txt: $(GF_alltenses)/Try%.gfo api-examples.gfs + GF_LIB_PATH=$(GF_LIB_PATH) $(GF) -retain -s $< $@ + +clean: + rm -rf \ + index.txt \ + index.html \ + api-examples.gfs \ + $(EXAMPLES_OUT) diff --git a/doc/MkExx.hs b/doc/synopsis/MkExx.hs similarity index 100% rename from doc/MkExx.hs rename to doc/synopsis/MkExx.hs diff --git a/doc/MkExxTable.hs b/doc/synopsis/MkExxTable.hs similarity index 100% rename from doc/MkExxTable.hs rename to doc/synopsis/MkExxTable.hs diff --git a/doc/MkSynopsis.hs b/doc/synopsis/MkSynopsis.hs similarity index 95% rename from doc/MkSynopsis.hs rename to doc/synopsis/MkSynopsis.hs index e07dec14c..c7cd1f27f 100644 --- a/doc/MkSynopsis.hs +++ b/doc/synopsis/MkSynopsis.hs @@ -6,14 +6,17 @@ import Data.Char import Data.List import qualified Data.Map as M import Text.Printf -import Config +import Config (loadLangsFrom, LangInfo (..)) +import qualified Config type Cats = [(String,String,String)] type Rules = [(String,String,String)] -- the file generated -synopsis :: FilePath -synopsis = "synopsis.txt" +outfile :: FilePath +outfile = "index.txt" + +configFile = ".." ".." Config.configFile -- the language in which revealed examples are shown revealedLang :: String @@ -22,7 +25,7 @@ revealedLang = "Eng" -- all languages shown (a copy of this list appears in Makefile) apiExxFiles :: IO [FilePath] apiExxFiles = do - langs <- loadLangsFrom (".." 
configFile) + langs <- loadLangsFrom configFile return $ [ "api-examples-" ++ (langCode lang) ++ ".txt" | lang <- langs @@ -35,7 +38,7 @@ main = do cs1 <- getCats commonAPI cs2 <- getCats catAPI let cs = sortCats (cs1 ++ cs2) - writeFile synopsis "GF Resource Grammar Library: Synopsis" + writeFile outfile "GF Resource Grammar Library: Synopsis" space append "%!Encoding:utf-8" append "%!style(html): ./revealpopup.css" @@ -50,7 +53,7 @@ main = do append "%!postproc(html): '#LParadigms' ''" append "%!postproc(tex): '#LParadigms' ''" delimit $ addToolTips cs - include "synopsis-intro.txt" -- TODO dynamic language list + include "intro.txt" -- TODO dynamic language list title "Categories" space link "Source 1:" commonAPI @@ -87,13 +90,13 @@ main = do title "Lexical Paradigms" paradigmFiles >>= mapM_ (putParadigms cs) space - include "synopsis-additional.txt" + include "additional.txt" space - include "synopsis-browse.txt" + include "browse.txt" space title "An Example of Usage" space - include "synopsis-example.txt" + include "example.txt" space title "Table of Contents" space @@ -232,7 +235,7 @@ mkCatTable cs = inChunks chsize (\rs -> header ++ map mk1 rs) cs mk1 (name,expl,ex) = unwords ["|", showCat cs name, "|", expl, "|", typo ex, "|"] typo ex = if take 1 ex == "\"" then itf (init (tail ex)) else ex -srcPath = (() "../src") +srcPath = (() "../../src") commonAPI = srcPath "abstract/Common.gf" catAPI = srcPath "abstract/Cat.gf" @@ -241,7 +244,7 @@ structuralAPI = srcPath "abstract/Structural.gf" paradigmFiles :: IO [(String,FilePath)] paradigmFiles = do - langs <- loadLangsFrom (".." 
configFile) + langs <- loadLangsFrom configFile return $ [ (name, srcPath $ printf "%s/Paradigms%s.gf" (langDir lang) (langCode lang)) | lang <- langs @@ -263,7 +266,7 @@ splitOn f s = takeWhile (not.f) s : splitOn f rest "" -> [] _:xs -> xs -append s = appendFile synopsis ('\n':s) +append s = appendFile outfile ('\n':s) title s = append $ "=" ++ s ++ "=" stitle s = append $ "==" ++ s ++ "==" include s = append $ "%!include: " ++ s diff --git a/doc/synopsis-additional.txt b/doc/synopsis/additional.txt similarity index 100% rename from doc/synopsis-additional.txt rename to doc/synopsis/additional.txt diff --git a/doc/api-examples.txt b/doc/synopsis/api-examples.txt similarity index 100% rename from doc/api-examples.txt rename to doc/synopsis/api-examples.txt diff --git a/doc/synopsis-browse.txt b/doc/synopsis/browse.txt similarity index 100% rename from doc/synopsis-browse.txt rename to doc/synopsis/browse.txt diff --git a/doc/categories-intro.txt b/doc/synopsis/categories-intro.txt similarity index 100% rename from doc/categories-intro.txt rename to doc/synopsis/categories-intro.txt diff --git a/doc/categories.dot b/doc/synopsis/categories.dot similarity index 100% rename from doc/categories.dot rename to doc/synopsis/categories.dot diff --git a/doc/synopsis-example.txt b/doc/synopsis/example.txt similarity index 100% rename from doc/synopsis-example.txt rename to doc/synopsis/example.txt diff --git a/doc/synopsis-intro.txt b/doc/synopsis/intro.txt similarity index 100% rename from doc/synopsis-intro.txt rename to doc/synopsis/intro.txt diff --git a/doc/quicklinks.js b/doc/synopsis/quicklinks.js similarity index 100% rename from doc/quicklinks.js rename to doc/synopsis/quicklinks.js diff --git a/doc/synopsis.css b/doc/synopsis/synopsis.css similarity index 100% rename from doc/synopsis.css rename to doc/synopsis/synopsis.css diff --git a/doc/_template.html b/doc/synopsis/template.html similarity index 100% rename from doc/_template.html rename to 
doc/synopsis/template.html diff --git a/doc/translation.html b/doc/translation.html deleted file mode 100644 index 64b621ea9..000000000 --- a/doc/translation.html +++ /dev/null @@ -1,329 +0,0 @@ - - - - - -From Resource Grammar to Wide Coverage Translation with GF - -
-

From Resource Grammar to Wide Coverage Translation with GF

-Aarne Ranta et al.
-January-May 2014 -
- - -

Scope

- -

-Wide-coverage interlingual translator for -Bulgarian, Chinese, Dutch, English, Finnish, French, German, -Hindi, Italian, Spanish, Swedish. -

- -

How to use it

- -

-If you just want to try it before reading more, -here are the main ways to get started: -

-

-1. Run on our server. http://www.grammaticalframework.org/demos/translation.html -

-

-2. Get an Android app. http://www.grammaticalframework.org/demos/app.html -

-

-3. Compile and run in the shell. Get the latest GF sources (with darcs or GitHub) and then -

- -
    -
  • compile and install the GF compiler and library and the C runtime (pgf-translate). -

    -
  • compile the translator: - -
    -    cd GF/lib/src
    -    make -j Translate11.pgf
    -
    - -This will take a long time (fifteen minutes or more) and will probably require at least 8GB of RAM. -

    -
  • run the translator - -
    -    pgf-translate Translate11.pgf Phr TranslateEng TranslateSwe
    -
    - -The source and the target language can of course be varied. -
- -
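The run step above can be repeated for several target languages. The following is a minimal dry-run sketch that only prints the commands, so it works without GF installed. It assumes that every concrete syntax is named `Translate` plus a language code, following the `TranslateEng` and `TranslateSwe` example; the helper `translate_commands` and the code `Fin` are illustrative and not taken from the sources.

```shell
#!/bin/sh
# Dry run: print one pgf-translate command per target language.
# Assumption: concrete syntaxes are named Translate<Code>, as in the
# TranslateEng/TranslateSwe example above; the codes passed in are illustrative.
PGF=Translate11.pgf
SOURCE=TranslateEng

# translate_commands is a hypothetical helper, not part of GF.
translate_commands() {
    for code in "$@"; do
        echo "pgf-translate $PGF Phr $SOURCE Translate$code"
    done
}

translate_commands Swe Fin
```

Once the printed commands look right, they can be run one at a time in the directory that contains the compiled PGF file.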

-4. To modify the sources, work on the files in -

- -
-    GF/lib/src/translator/
-
- -

-It is these files that will be explained below. -

- -

GF and the RGL

- -

-GF, Grammatical Framework, was originally designed for the purpose of multilingual controlled language systems, -which would enable high-quality translation on limited domains. The abstract syntax of GF defines the semantic -structures relevant for the domain, and the concrete syntaxes map these structures to grammatically correct -and idiomatic text in each target language. The reversibility of GF enables both generation and parsing, -and thereby translation where the abstract syntax functions as an interlingua. -

-

-It was soon realized that the bottleneck of GF applications is the definition of concrete syntax, which requires a lot -of manual work and linguistic skill, because of the complexities of natural language syntax and morphology. Some of -the complexities can be ignored in a small system. For instance, in a mathematical system, it may be enough to -use verbs in the present tense only. But very much the same linguistic problems must be solved again and again -in new applications: French verb inflection is the same in mathematics as in a tourist phrasebook. To solve -this problem, the GF Resource Grammar Library (RGL) was developed, to take care of "low-level" linguistic -rules such as inflection, agreement, and word order. This enables the authors of application grammars to focus -on the semantics (when designing the abstract syntax) and on selecting RGL functions that produce the idioms they -want. The RGL grew into an international open-source project, in which more than 50 people have contributed to -implementing it for 29 languages at the time of writing. -

- -

Scaling up GF translation

- -

-The RGL was thus originally designed to be used just as its name says: as a library -for application grammars. Only the latter were meant to be used as top-level grammars, i.e. for -parsing, generation, and translation at run time. Little attention was therefore -paid to the usability of the RGL as a top-level -grammar by itself. But as applications accumulated, ranging from technical text to spoken dialogue, the coverage -of the RGL grew to approximate a "complete grammar" of many of the languages. -And recently, there has indeed been success in using the RGL as a wide-coverage translation grammar, -mainly due to Krasimir Angelov's efforts to scale up the size of GF applications from language fragments -to open-text processing. This success is a result of four lines of development: -

- -
    -
  • More efficient processing, both due to better algorithms and to an optimized C implementation of a PGF - interpreter, the C runtime, achieving speeds competitive with the state of the art, e.g. the Stanford parser. - This development is also based on the work of Peter Ljunglöf on GF parsing and Lauri Alanko on the C runtime. -

    -
  • Large-scale dictionaries, both manually built and extracted from free sources, and linked into a multilingual - translation dictionary now covering 10k to 60k entries for eleven languages. This work was started by Björn Bringert, - who ported the Oxford Advanced Learner's Dictionary of English to GF. -

    -
  • Probabilistic disambiguation, using a model trained from the Penn Treebank. Due to the common abstract syntax, - the same model can be used for other languages as well, even though the adequacy of this transfer has not - been systematically evaluated. -

    -
  • Robust parsing, which recovers from unknown words and syntax - by using chunk-by-chunk translations. This leads to loss of quality, but fulfills the principle that - "something is better than nothing". -
- -

Remaining problems

- -

-The result of all this work is a wide-coverage translation system, which can be used in the same way as Google -Translate, Bing, Systran, and Apertium - to "translate anything", albeit with varying quality. At the moment of -writing, the performance is not yet generally on a level with the best of the competition, but shows some promising -improvements in e.g. long-distance agreement and word order. To turn these advantages into absolute improvements, we -will need to fix problems that the other systems (or at least some of them) get right but where GF translation -often fails: -

- -
    -
  • Lexical coverage, to eliminate parsing failures due to unknown words. -

    -
  • Disambiguation, with more sophisticated models than the essentially context-free tree model used now. -

    -
  • Speed, which gets worse with long sentences and with more complex languages. -

    -
  • Idiomacy, due to the lack of idiomatic constructions that are not compositional and are therefore not handled correctly - in the RGL but are often correct in phrase-based SMT. -
- -

Advantages of GF translation

- -

-Provided that these issues are resolved, the strengths of the GF approach can be made more visible: -

- -
    -
  • Grammaticality, in particular the already mentioned issues of agreement and word order. -

    -
  • Predictability, in the sense that a local change in the input usually results in a corresponding - local change in the output (unless otherwise required by idiomacy). -

    -
  • Feedback, i.e. the ease of showing the confidence level of the translation, alternative translations, - and linguistic information. -

    -
  • Adaptability, i.e. the ease of fixing bugs, adapting the system to special domains, and personalizing it. - This can be done with great precision. For instance, a bug in a grammar can be fixed without - breaking anything else. -

    -
  • Light weight. The system runs on standard laptops and even on mobile phones; the size of the run-time - system for all pairs of 11 languages is under 25MB (on the Android platform), and recompiling the whole - system (e.g. after bug fixes or - domain adaptation) is a matter of a few minutes, where corresponding figures for SMT systems are gigabytes of size - and days of retraining. -

    -
  • Multilinguality, in the sense that once the parsing of the input is settled, the output can be readily - rendered into all other languages, - and also in the sense that the GF model works equally well for any language pair. -
- -

Wanted: more work, new ideas

- -

-The recipes for improvement are, as always, more work and new ideas. Each of the four weaknesses mentioned -above can be alleviated by more work - in particular, lexical coverage by more work on the lexicon, since -automatic extraction methods cannot really be trusted. As for disambiguation, new ideas about probabilistic -tree models are being discussed. As for speed, new ideas on parsing (in particular, the integration of disambiguation -with parsing) would help, but also the complexity of grammatical structures plays a major role. As for idiomacy, -more work is being done on introducing constructions (non-compositional syntax rules, generalizing the notion of -multiword expressions, in particular, phrases in SMT), but also new ideas are being discussed on how to -extract such constructions from e.g. phrase tables. -

-

-In the following, we will focus on describing the role of grammar in the GF translation system - in particular, how -RGL can be modified to become usable as a top-level grammar for translating open text. -As RGL was not meant to be used for parsing open text, but rather for the controlled language generation task, -it has serious restrictions: -

- -
    -
  • Limited coverage. The RGL does not cover all structures in any language - hence it is likely to fail when - parsing unlimited text. -

    -
  • Semantic overgeneration. Semantic distinctions, such as between mass and count nouns, or place and manner - adverbials, are assumed to be defined in application grammars; the RGL just defines the combinatorics of - elements, but doesn't prescribe which elements can really go together. -

    -
  • Spurious ambiguities. RGL parsing creates more ambiguities than would be necessary if there - were more semantic control. In addition, there are partly overlapping structures, which generate - spurious syntactic ambiguities. - Example: the very liberal apposition function. -

    -
  • Inefficiency. Partly because of ambiguities, partly because of the deep nesting and complex data structures, parsing - with the RGL can be very slow when compared to application grammars, even the comprehensive ResourceDemo grammar. - For some languages (Romanian, versions of French and Finnish), parsing is not practically possible at all because - PGF generation fails for memory reasons. -

    -
  • Syntax orientation. The structures of the RGL are rather superficial and don't guarantee translation - equivalence when used as interlingua. -

    -
  • Coarse categories. This is a particular aspect of syntax orientation, and causes at the same time overgeneration - and spurious ambiguities. - Example: the category Adv. -
- -

What speaks for using RGL

- -

-Despite these problems, the RGL has proven to be a possible starting point for large-scale translation. Two -advantages speak for it: -

- -
    -
  • Coverage. Even though not complete, the RGL has grown into a coverage that is close to complete enough; work - with English shows that just about 20% more constructions can take us there. -

    -
  • Maintainability. The RGL is constantly developed and maintained in its own right, and it makes sense to take - advantage of this and avoid duplicated work with some other large-scale grammar. -
- -

-Of course, we are still left with the other -option of addressing translation with an application grammar, something -similar to the ResourceDemo with flatter and more semantic structures. But this would in turn require -the replication of many rules, even though it would be to a large extent doable by using a functor, that is, -by just one set of rules covering all languages. -

- -

The structure of the wide-coverage translation grammar

- -

-Thus the path chosen is a mixture of RGL and application grammar. In brief, the translation grammar consists of -

- -
    -
  • Selected RGL modules and functions, as they are (using restricted inheritance); around 80% of the syntax. -

    -
  • Overridden RGL functions, with more general types; just a few of them. -

    -
  • Overridden RGL linearizations, typically with more variants in individual languages; just a few, but - increasing. -

    -
  • Syntax extension, new categories and functions, around 20% of the syntax, and increasing. -

    -
  • Big lexicon, with an abstract syntax of 65k lemmas, increasing. -

    -
  • Constructions, inspired by (and partly derived from) Construction Grammars, to capture idioms that - involve specific lexical items and are therefore "between the syntax and the lexicon". -
- -

-The following picture shows the principal module structure of the translation grammar. -

-

- -

-

-Here is a description of each of the modules: -

- -
    -
  • Translate is the top module, which combines the RGL syntax with syntax extensions and a dictionary. - The RGL syntax is not inherited in its entirety, which is indicated by a dashed line. The overridden abstract - syntax functions (common to all languages) are replaced by functions in the Extensions module, whereas the - overridden concrete syntax definitions (specific to each language) are defined in this Translate module. - This consists of the module named Translate. -

    -
  • RGLSyntax stands for the standard RGL module for syntax, excluding the RGL test lexicon and - the language-specific extensions of it. This consists of the standard module named Grammar and - the emerging module named Construction. -

    -
  • Extensions stands for the syntax extensions added to the RGL syntax. This consists of the module - named Extensions. -

    -
  • Dictionary is a large-scale multilingual dictionary. Its abstract syntax uses as identifiers English words - suffixed by categories and word sense information. This consists of the module named Dictionary. -

    -
  • RGLCategories stands for the type system of the standard RGL, the module named Cat. -

    -
  • Chunk is the grammar defining what chunks (noun phrases, verbs, - adverbs, etc) can be used and how they are combined, when exact - syntactic combination fails. -
- -

Where and why the translation grammar differs from the RGL

- -

-A guiding principle is thus that the translation grammar preserves as much as possible of the RGL, so that -duplicated work is avoided. But as the purposes of the two are different, not everything is possible. Two -diverging principles have already been mentioned: -

- -
    -
  • Free variation. The RGL bans free variation, because library users need to have full control over selecting - variants. For instance, English negation has two forms, contracted (don't) and uncontracted (do not), - which in the translation grammar are treated as variants. But RGL users sometimes need to choose one or the - other, for instance, excluding contracted negation in formal style. -

    -
  • Semantic distinctions. The RGL avoids semantic distinctions that are not absolutely necessary for syntax. - The reason for this is the ambition to keep the library as simple as possible, in particular for the voluntary - implementors of new languages. But meaning-preserving translation needs more distinctions, for instance, in - word senses, subcategorizations, selection restrictions, and tense and aspect systems. -
- -

-The old design principles of the RGL are thus kept in force, and this is made possible by separating parts of the -translation grammar modules from the RGL. -

- - - -