Don't use string sharing in LexGF.

Profiling showed that when loading a large .gfo file, shareString accounted for
15-18% of the CPU time and a lot of the allocation. Since we already use ByteStrings for
reading the source files, the main effect of shareString is to create lots
of small ByteStrings instead of one large one. Since the raw size of the .gfo file is seldom
a problem (unlike when it was read as a String), it is OK to keep the whole file
as one ByteString in RAM and have all tokens point into it.

Profiling after the change showed a 15-20% reduction in CPU time and in total allocation.
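
The trade-off described above can be illustrated with a minimal sketch. It is not the GF lexer: tokensInPlace and tokensShared are hypothetical names, whitespace splitting stands in for real lexing, and the sharing scheme shown (copy each distinct token once and reuse the copy) is only an assumption about what a sharing step might do. The actual shareString in GF.Source.SharedString may work differently.

```haskell
import qualified Data.ByteString.Char8 as BS
import qualified Data.Map.Strict as Map

-- Hypothetical stand-in for the lexer: split on whitespace.
-- BS.words returns slices of the input, so every token points
-- into the one large buffer and no token bytes are copied.
tokensInPlace :: BS.ByteString -> [BS.ByteString]
tokensInPlace = BS.words

-- Hypothetical sharing scheme (an assumption, not GF's code):
-- each distinct token is copied into its own small ByteString,
-- and later occurrences reuse that copy via a lookup table.
tokensShared :: BS.ByteString -> [BS.ByteString]
tokensShared = go Map.empty . BS.words
  where
    go _ [] = []
    go seen (t : ts) =
      case Map.lookup t seen of
        Just shared -> shared : go seen ts
        Nothing ->
          let shared = BS.copy t -- allocates a new small buffer
          in shared : go (Map.insert t shared seen) ts
```

Both functions return equal token lists for the same input; they differ only in where the bytes live. The in-place version keeps the whole input alive for as long as any token is referenced, which is the cost the commit message accepts in exchange for skipping the table lookups and the small allocations.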
bjorn
2008-11-26 16:19:54 +00:00
parent 5dee98234e
commit 720932d751
3 changed files with 5 additions and 5 deletions

@@ -4,7 +4,7 @@
{-# OPTIONS -fno-warn-incomplete-patterns #-}
module GF.Source.LexGF where
import GF.Source.SharedString
import qualified Data.ByteString.Char8 as BS
#if __GLASGOW_HASKELL__ >= 603
@@ -44,7 +44,7 @@ alex_accept = listArray (0::Int,34) [[],[],[(AlexAccSkip)],[(AlexAccSkip)],[],[(
tok f p s = f p s
share :: BS.ByteString -> BS.ByteString
-share = shareString
+share = id
data Tok =
TS !BS.ByteString !Int -- reserved words and symbols

@@ -4,7 +4,7 @@
{-# OPTIONS -fno-warn-incomplete-patterns #-}
module GF.Source.LexGF where
import GF.Source.SharedString
import qualified Data.ByteString.Char8 as BS
}
@@ -39,7 +39,7 @@ $d+ \. $d+ (e (\-)? $d+)? { tok (\p s -> PT p (TD $ share s)) }
tok f p s = f p s
share :: BS.ByteString -> BS.ByteString
-share = shareString
+share = id
data Tok =
TS !BS.ByteString !Int -- reserved words and symbols

@@ -1,5 +1,5 @@
all:
-cd ../.. && bnfc -p GF.Source -bytestrings -sharestrings GF/Source/GF.cf
+cd ../.. && bnfc -p GF.Source -bytestrings GF/Source/GF.cf
rm ErrM.hs
perl -i -pe 's/%name pModHeader ModHeader/%partial pModHeader ModHeader/' ParGF.y
perl -i -pe 's/GF.Source.ErrM/GF.Data.ErrM/' *.hs *.x *.y