diff --git a/doc/python-api.html b/doc/runtime-api.html similarity index 59% rename from doc/python-api.html rename to doc/runtime-api.html index 0cc4f2701..015c3d372 100644 --- a/doc/python-api.html +++ b/doc/runtime-api.html @@ -1,82 +1,189 @@
+ + -+Before you use the Python binding you need to import the PGF2 modulepgf modulepgf package. +>>> import pgf++Prelude> import PGF2 +++import org.grammaticalframework.pgf.*; +-Once you have the module imported, you can use the dir and +Once you have the module imported, you can use the dir and help functions to see what kind of functionality is available. dir takes an object and returns a list of methods available in the object: -+-A grammar is loaded by calling the method readPGF: ->>> dir(pgf)help is a little bit more advanced and it tries to produce more human-readable documentation, which moreover contains comments: -+>>> help(pgf)++A grammar is loaded by calling the method pgf.readPGFthe function readPGFthe method PGF.readPGFthe method PGF.ReadPGF: +>>> gr = pgf.readPGF("App12.pgf")++Prelude PGF2> gr <- readPGF "App12.pgf" +++PGF gr = PGF.readPGF("App12.pgf") +From the grammar you can query the set of available languages. It is accessible through the property languages which -is a map from language name to an object of class pgf.Concr +is a map from language name to an object of class pgf.Concrtype Concrclass Concr which represents the language. For example, the following will extract the English language: -+>>> eng = gr.languages["AppEng"] >>> print(eng) <pgf.Concr object at 0x7f7dfa4471d0>++Prelude PGF2> let Just eng = Data.Map.lookup "AppEng" (languages gr) +Prelude PGF2> :t eng +eng :: Concr +++Concr eng = gr.getLanguages().get("AppEng") +Parsing
-All language specific services are available as methods of the -class pgf.Concr. For example to invoke the parser, you -can call: -+All language specific services are available as +methods of the class pgf.Concrfunctions that take as an argument an object of type Concrmethods of the class Concr. +For example to invoke the parser, you can call: +>>> i = eng.parse("this is a small theatre")-This gives you an iterator which can enumerates all possible -abstract trees. You can get the next tree by calling next: -++Prelude PGF2> let res = parse eng (startCat gr) "this is a small theatre" +++Iterable<ExprProb> iterable = eng.parse(gr.startCat(), "this is a small theatre") ++ +This gives you an iterator which can enumerate all possible +abstract trees. You can get the next tree by calling next: +>>> p,e = i.next()or by calling __next__ if you are using Python 3: -++ +This gives you a result of type Either String [(Expr, Float)]. +If the result is Left then the parser has failed and you will +get the token where the parser got stuck. If the parsing was successful +then you get a potentially infinite list of parse results: +>>> p,e = i.__next__()-The results are always pairs of probability and tree. The probabilities -are negated logarithmic probabilities and which means that the lowest ++Prelude PGF2> let Right ((p,e):rest) = res ++ + +This gives you an iterable which can enumerate all possible +abstract trees. You can get the next tree by calling next: ++Iterator<ExprProb> iter = iterable.iterator() +ExprProb ep = iter.next() ++ + +The results are pairs of probability and tree. The probabilities +are negated logarithmic probabilities and this means that the lowest number encodes the most probable result. The possible trees are returned in decreasing probability order (i.e. increasing negated logarithm). The first tree should have the smallest p: -
+ +>>> print(p) 35.9166526794++Prelude PGF2> print p +35.9166526794 +++System.out.println(ep.getProb()) +35.9166526794 +and this is the corresponding abstract tree: -+>>> print(e) PhrUtt NoPConj (UttS (UseCl (TTAnt TPres ASimul) PPos (PredVP (DetNP (DetQuant this_Quant NumSg)) (UseComp (CompNP (DetCN (DetQuant IndefArt NumSg) (AdjCN (PositA small_A) (UseN theatre_N)))))))) NoVoc++Prelude PGF2> print e +PhrUtt NoPConj (UttS (UseCl (TTAnt TPres ASimul) PPos (PredVP (DetNP (DetQuant this_Quant NumSg)) (UseComp (CompNP (DetCN (DetQuant IndefArt NumSg) (AdjCN (PositA small_A) (UseN theatre_N)))))))) NoVoc +++System.out.println(ep.getExpr()) +PhrUtt NoPConj (UttS (UseCl (TTAnt TPres ASimul) PPos (PredVP (DetNP (DetQuant this_Quant NumSg)) (UseComp (CompNP (DetCN (DetQuant IndefArt NumSg) (AdjCN (PositA small_A) (UseN theatre_N)))))))) NoVoc ++Note that depending on the grammar it is absolutely possible that for +a single sentence you might get infinitely many trees. +In other cases the number of trees might be finite but still enormous. +The parser is specifically designed to be lazy, which means that +each tree is returned as soon as it is found, before exhausting +the full search space. For grammars with a pathological number of +trees it is advisable to pick only the top N trees +and to ignore the rest.
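Because the parse results are delivered lazily in increasing cost order, picking the top N trees, as advised above, is just a matter of taking a prefix of the iterator. A minimal sketch for the Python binding (the commented call assumes the grammar and `eng` object from the text):

```python
from itertools import islice

def top_n_parses(results, n):
    # `results` is any lazy iterator of (probability, tree) pairs,
    # such as the one returned by Concr.parse.  Because the parser
    # yields trees in increasing negated-log-probability order,
    # the first n items are the n most probable trees, and the
    # (possibly infinite) rest of the search space is never forced.
    return list(islice(results, n))

# i = eng.parse("this is a small theatre")
# best_five = top_n_parses(i, 5)
```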
+ + The parse method also has the following optional parameters:-By using these parameters it is possible for instance to change the start category for
@@ -85,21 +192,38 @@ The parse method has also the following optional parameters: cat (the start category) and callbacks (a list of category and callback function) By using these parameters it is possible, for instance, to change the start category for the parser or to limit the number of trees returned from the parser. For example -parsing with a different start category can be done as follows: -
->>> i = eng.parse("a small theatre", cat="NP") +parsing with a different start category can be done as follows: ++ +There is also the function parseWithHeuristics which +takes two more parameters which let you have better control +over the parser's behaviour: ++>>> i = eng.parse("a small theatre", cat=pgf.readType("NP"))++let res = parseWithHeuristics eng (startCat gr) heuristic_factor callbacks ++ + +There is also the method parseWithHeuristics which +takes two more parameters which let you have better control +over the parser's behaviour: ++Iterable<ExprProb> iterable = eng.parseWithHeuristics(gr.startCat(), heuristic_factor, callbacks) ++The heuristics factor can be used to trade parsing speed for quality. -By default the list of trees is sorted by probability this corresponds +By default the list of trees is sorted by probability and this corresponds to factor 0.0. When we increase the factor then parsing becomes faster but at the same time the sorting becomes imprecise. The worst factor is 1.0. In any case the parser always returns the same set of trees but in a different order. Our experience is that even a factor -of about 0.6-0.8 with the translation grammar, still orders -the most probable tree on top of the list but further down the list +of about 0.6-0.8 with the translation grammar still orders +the most probable tree on top of the list but further down the list, the trees become shuffled.
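Since the parser returns the same set of trees under any factor, only in a different order, one option is to parse quickly with a high factor, truncate the enumeration, and then restore the exact probability ranking locally. A sketch with our own helper (`resort_by_probability` is not part of the binding):

```python
def resort_by_probability(results):
    # Each result is a (negated log-probability, tree) pair, so an
    # ascending sort on the first component restores the default
    # (factor 0.0) ranking.  This exhausts the enumeration, so apply
    # it only to a finite, already truncated list of results.
    return sorted(results, key=lambda pair: pair[0])
```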
@@ -115,38 +239,85 @@ You can either linearize the result from the parser back to another language, or you can explicitly construct a tree and then linearize it in any language. For example, we can create a new expression like this: -+>>> e = pgf.readExpr("AdjCN (PositA red_A) (UseN theatre_N)")++Prelude PGF2> let Just e = readExpr "AdjCN (PositA red_A) (UseN theatre_N)" +++Expr e = Expr.readExpr("AdjCN (PositA red_A) (UseN theatre_N)") +and then we can linearize it: -+>>> print(eng.linearize(e)) red theatre++Prelude PGF2> putStrLn (linearize eng e) +red theatre +++System.out.println(eng.linearize(e)) +red theatre +This method produces only a single linearization. If you use variants in the grammar then you might want to see all possible linearizations. For that purpose you should use linearizeAll: -+>>> for s in eng.linearizeAll(e): print(s) red theatre red theater++Prelude PGF2> mapM_ putStrLn (linearizeAll eng e) +red theatre +red theater +++for (String s : eng.linearizeAll(e)) { + System.out.println(s); +} +red theatre +red theater +If, instead, you need an inflection table with all possible forms -then the right method to use is tabularLinearize: -+then the right method to use is tabularLinearize: +>>> eng.tabularLinearize(e) {'s Sg Nom': 'red theatre', 's Pl Nom': 'red theatres', 's Pl Gen': "red theatres'", 's Sg Gen': "red theatre's"}++Prelude PGF2> tabularLinearize eng e +{'s Sg Nom': 'red theatre', 's Pl Nom': 'red theatres', 's Pl Gen': "red theatres'", 's Sg Gen': "red theatre's"} +++for (Map.Entry<String,String> entry : eng.tabularLinearize(e)) { + System.out.println(entry.getKey() + ": " + entry.getValue()); +} +s Sg Nom: red theatre +s Pl Nom: red theatres +s Pl Gen: red theatres' +s Sg Gen: red theatre's +Finally, you could also get a linearization which is bracketed into a list of phrases: -
+>>> [b] = eng.bracketedLinearize(e) >>> print(b) (CN:4 (AP:1 (A:0 red)) (CN:3 (N:2 theatre)))++Prelude PGF2> let [b] = bracketedLinearize eng e +Prelude PGF2> print b +(CN:4 (AP:1 (A:0 red)) (CN:3 (N:2 theatre))) +++Object[] bs = eng.bracketedLinearize(e) +Each bracket is actually an object of type pgf.Bracket. The property cat of the object gives you the name of the category and the property children gives you a list of nested brackets. @@ -161,9 +332,15 @@ that doesn't have linearization definitions. In that case you will just see the name of the function in the generated string. It is sometimes helpful to be able to see whether a function is linearizable or not. This can be done in this way: -+>>> print(eng.hasLinearization("apple_N"))++Prelude PGF2> print (hasLinearization eng "apple_N") +++System.out.println(eng.hasLinearization("apple_N")) +Analysing and Constructing Expressions
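The bracketed linearization described above, where each pgf.Bracket carries a cat name and a children list, can be walked recursively. A sketch with our own helper, under the assumption that leaves in children are plain token strings while every other node exposes cat and children:

```python
def bracket_outline(b, depth=0):
    # Returns one indented line per node: the category name for a
    # bracket, the token itself for a leaf (a plain string).
    if isinstance(b, str):
        return ["  " * depth + b]
    lines = ["  " * depth + b.cat]
    for child in b.children:
        lines.extend(bracket_outline(child, depth + 1))
    return lines

# for line in bracket_outline(eng.bracketedLinearize(e)[0]):
#     print(line)
```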
@@ -171,7 +348,7 @@ is linearizable or not. This can be done in this way: An already constructed tree can be analyzed and transformed in the host application. For example you can deconstruct a tree into a function name and a list of arguments: -+>>> e.unpack() ('AdjCN', [<pgf.Expr object at 0x7f7df6db78c8>, <pgf.Expr object at 0x7f7df6db7878>])@@ -181,7 +358,7 @@ tree. If the tree is a function application then you always get a tuple of function name and a list of arguments. If instead the tree is just a literal string then the return value is the actual literal. For example the result from: -+>>> pgf.readExpr('"literal"').unpack() 'literal'@@ -200,7 +377,7 @@ will be called each time the corresponding function is encountered, and its arguments will be the arguments from the original tree. If there is no matching method name then the runtime will call the method default. The following is an example: -+>>> class ExampleVisitor: def on_DetCN(self,quant,cn): print("Found DetCN") @@ -229,7 +406,7 @@ Constructing new trees is also easy. You can either use readExpr to read trees from strings, or you can construct new trees from existing pieces. This is possible by using the constructor for pgf.Expr: -+>>> quant = pgf.readExpr("DetQuant IndefArt NumSg") >>> e2 = pgf.Expr("DetCN", [quant, e]) >>> print(e2) @@ -246,14 +423,14 @@ the grammar you can call the method embed, which will dynamically create a Python module with one Python function for every function in the abstract syntax of the grammar. After that you can simply import the module: -+>>> gr.embed("App") <module 'App' (built-in)> >>> import AppNow creating new trees is just a matter of calling ordinary Python functions: -+>>> print(App.DetCN(quant,e)) DetCN (DetQuant IndefArt NumSg) (AdjCN (PositA red_A) (UseN house_N))@@ -264,13 +441,13 @@ There are two methods that give you direct access to the morphological lexicon. The first makes it possible to dump the full form lexicon. 
The following code just iterates over the lexicon and prints each word form with its possible analyses: -+for entry in eng.fullFormLexicon(): print(entry)The second one implements a simple lookup. The argument is a word form and the result is a list of analyses: -+print(eng.lookupMorpho("letter")) [('letter_1_N', 's Sg Nom', inf), ('letter_2_N', 's Sg Nom', inf)]@@ -279,22 +456,22 @@ print(eng.lookupMorpho("letter")) There is a simple API for accessing the abstract syntax. For example, you can get a list of abstract functions: -+>>> gr.functions ....or a list of categories: -+>>> gr.categories ....You can also access all functions with the same result category: -+>>> gr.functionsByCat("Weekday") ['friday_Weekday', 'monday_Weekday', 'saturday_Weekday', 'sunday_Weekday', 'thursday_Weekday', 'tuesday_Weekday', 'wednesday_Weekday']The full type of a function can be retrieved as: -+>>> print(gr.functionType("DetCN")) Det -> CN -> NP@@ -304,7 +481,7 @@ Det -> CN -> NPThe runtime type checker can do type checking and type inference for simple types. Dependent types are still not fully implemented in the current runtime. The inference is done with method inferExpr: -
+>>> e,ty = gr.inferExpr(e) >>> print(e) AdjCN (PositA red_A) (UseN theatre_N) @@ -318,13 +495,13 @@ wouldn't be true when dependent types are added.Type checking is also trivial: -
+>>> e = gr.checkExpr(e,pgf.readType("CN")) >>> print(e) AdjCN (PositA red_A) (UseN theatre_N)In case of a type error you will get an exception: -+>>> e = gr.checkExpr(e,pgf.readType("A")) pgf.TypeError: The expected type of the expression AdjCN (PositA red_A) (UseN theatre_N) is A but CN is infered@@ -339,7 +516,7 @@ inconvenient because loading becomes slower and the grammar takes more memory. For that purpose you could split the grammar into one file for the abstract syntax and one file for every concrete syntax. This is done by using the option -split-pgf in the compiler: -+$ gf -make -split-pgf App12.pgf@@ -347,13 +524,13 @@ Now you can load the grammar as usual but this time only the abstract syntax will be loaded. You can still use the languages property to get the list of languages and the corresponding concrete syntax objects: -+>>> gr = pgf.readPGF("App.pgf") >>> eng = gr.languages["AppEng"]However, if you now try to use the concrete syntax then you will get an exception: -+>>> gr.languages["AppEng"].lookupMorpho("letter") Traceback (most recent call last): File "Before using the concrete syntax, you need to explicitly load it: -", line 1, in @@ -361,7 +538,7 @@ pgf.PGFError: The concrete syntax is not loaded +>>> eng.load("AppEng.pgf_c") >>> print(eng.lookupMorpho("letter")) [('letter_1_N', 's Sg Nom', inf), ('letter_2_N', 's Sg Nom', inf)] @@ -369,7 +546,7 @@ When you don't need the language anymore, you can simply unload it: -+>>> eng.unload()@@ -379,7 +556,7 @@ GraphViz is used for visualizing abstract syntax trees and parse trees. In both cases the result is GraphViz code that can be used for rendering the trees. See the examples below. 
-+>>> print(gr.graphvizAbstractTree(e)) graph { n0[label = "AdjCN", style = "solid", shape = "plaintext"] @@ -394,7 +571,7 @@ n0 -- n3 [style = "solid"] }-+>>> print(eng.graphvizParseTree(e)) graph { node[shape=plaintext] diff --git a/index.html b/index.html index d38227c1a..18f8c8bce 100644 --- a/index.html +++ b/index.html @@ -90,9 +90,7 @@ function sitesearch() {Develop Applications