java.lang.Object
vn.Cyc
public class Cyc
This class implements the mapping of VerbNet verb-frame pairs to Cyc verbSemTrans rules. This is the bulk of the VxC application. A better name for this class might have been Mapper or CycMapper, since the class is not merely the embodiment of the Cyc data, but Cyc was chosen for simplicity.
VxC follows the Inspector extension method of writing as little code as possible in the EventManager class, so that this class, together with Matcher and PrepositionManager, embodies the extension.
The VxC application accepts 5 additional command-line options (view usage note for more information):
verbSemTrans
rules.
A Cyc input file supplemental/cyc-rules.txt
is provided in the vxc.tar.gz
download file.
More information on this file can be found here: cycFile.
supplemental/manual-mapping.xml
is provided in the vxc.tar.gz
download file.
vxc.tar.gz
download file can be found
here.
EventManager
class.
The Inspector system (running in verb-frame pair mode) scans the VerbNet XML files and fires events when it encounters various elements. When it encounters a verb-frame pair, it calls findCycMatches(String, String, String, int, String) to search for matches with the Cyc rules.
A match is determined by applying all the supplied constraints (via the -A operator) in order; a possible match that no constraint discards is assumed to be a good match. At the end of the program, the results of the automatic matching are printed to stdout.
Symbol | Name | Description | Method |
---|---|---|---|
0 | null match | excludes all possible matches | Matcher.nullMatch() |
n | naive match | excludes only those possible matches whose lemmas are not equal | Matcher.naive(String, String) |
p | preposition | excludes those possible matches where VerbNet does not have required Cyc preposition | Matcher.cycPreposition(String, String) |
t | transitivity | excludes those possible matches where VerbNet's and Cyc's transitivity do not correspond | Matcher.transitivity(String, String) |
i | infinitive/gerund | excludes those possible matches where VerbNet's and Cyc's infinitives or gerunds do not correspond | Matcher.infinitiveGerund(String, String) |
j | adjective | excludes those possible matches where VerbNet's and Cyc's adjectives do not correspond | Matcher.adjective(String, String) |
f | fromLocation implies Source | excludes those possible matches where Cyc uses fromLocation and VerbNet does not use Source | Matcher.fromLocationImpliesSource(String, String) |
d | dbpb implies Agent | excludes those possible matches where Cyc uses doneBy or performedBy and VerbNet does not use Agent | Matcher.DBPBimpliesAgent(String, String) |
m | middle voice implies no Agent | excludes those possible matches where the Cyc rule is a MiddleVoiceFrame but VerbNet uses Agent | Matcher.middleVoiceNoAgent(String, String) |
a | all | applies all above constraints except 0, in the above order | |
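The constraint symbols above are applied in the order they are given to the -A operator, and a possible match is dropped by the first constraint that rejects it. The following is a minimal sketch of that idea, not the VxC implementation: the class ConstraintPipelineSketch, the Candidate type, and the two stand-in predicates are hypothetical, and the expansion string for the "a" symbol is assumed from the table order.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

final class ConstraintPipelineSketch {

    /** Hypothetical stand-in for one possible match (not the real Cyc.Match class). */
    static final class Candidate {
        final String vnLemma;
        final String cycLemma;
        final boolean vnTransitive;
        final boolean cycTransitive;

        Candidate(String vnLemma, String cycLemma, boolean vnTransitive, boolean cycTransitive) {
            this.vnLemma = vnLemma;
            this.cycLemma = cycLemma;
            this.vnTransitive = vnTransitive;
            this.cycTransitive = cycTransitive;
        }
    }

    /** Assumed expansion of the "a" symbol: every symbol except 0, in table order. */
    static final String ALL_SYMBOLS = "nptijfdm";

    /** Only three constraints are sketched; each predicate returns true to KEEP the candidate. */
    private static final Map<Character, Predicate<Candidate>> CONSTRAINTS = new HashMap<>();
    static {
        CONSTRAINTS.put('0', c -> false);
        CONSTRAINTS.put('n', c -> c.vnLemma.equals(c.cycLemma));
        CONSTRAINTS.put('t', c -> c.vnTransitive == c.cycTransitive);
    }

    /** Returns the symbol of the first constraint that discards the candidate, or null if it survives. */
    static Character firstDiscard(String symbols, Candidate c) {
        String expanded = symbols.replace("a", ALL_SYMBOLS);
        for (char symbol : expanded.toCharArray()) {
            Predicate<Candidate> keep = CONSTRAINTS.get(symbol);
            // Symbols without a sketched predicate are skipped in this example.
            if (keep != null && !keep.test(c)) {
                return symbol;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Candidate c = new Candidate("jump", "leap", true, false);
        System.out.println(firstDiscard("nt", c)); // n
        System.out.println(firstDiscard("tn", c)); // t
    }
}
```

Because the first failing constraint decides the outcome, the same candidate can be reported with a different discard symbol when the argument order changes.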

Flag | Name | Variable |
---|---|---|
c | show class match counts | flShowClassMatchCounts |
d | show discards | flShowDiscards |
m | manual mapping classes only | flManualClassesOnly |
29,245 verb-frame pairs x 3,256 Cyc rules = 95,221,720 possible matches
26,128 naive matches
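The size of the search space quoted above is simply the product of the two counts. As a small worked check (the class and method names here are illustrative only and are not part of VxC):

```java
final class SearchSpaceSketch {

    /** Every verb-frame pair can in principle be paired with every Cyc rule. */
    static long possibleMatches(long verbFramePairs, long cycRules) {
        return verbFramePairs * cycRules;
    }

    /** Share of the search space that survives a constraint, as a fraction between 0 and 1. */
    static double survivingFraction(long kept, long total) {
        return (double) kept / total;
    }

    public static void main(String[] args) {
        long total = possibleMatches(29_245L, 3_256L);
        System.out.println(total); // 95221720
        // Share left after the naive (equal-lemma) constraint, printed as a percentage.
        System.out.printf("%.4f%%%n", 100.0 * survivingFraction(26_128L, total));
    }
}
```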
Inspector
,
EventManager
Nested Class Summary | |
---|---|
private static class |
Cyc.InvalidCycInputFileException
Exception class for identifying if the Cyc input file is invalid. |
private static class |
Cyc.Match
This class represents a match between a VerbNet verb-frame pair and a Cyc rule. |
Field Summary | |
---|---|
(package private) static String |
ALL_MATCH_CONSTRS
All constraint symbols used with the -A operator. |
private static ArrayList |
AM
All matches found by the automatic matching algorithm. |
private static int |
classMatchCount
The number of possible matches that the automatic matching algorithm has marked as 'good' in the current class or subclass. |
private static String |
curClass
The current class or subclass that the Inspector is scanning. |
(package private) static File |
cycFile
The file containing all the verbSemTrans rules from Cyc. |
(package private) static File |
cycFileCmp
The compressed version of the Cyc input file ( cycFile ). |
private static ArrayList[] |
cycRules
All Cyc rules. |
(package private) static String |
flags
A string representation of all of the flags supplied on the command line. |
(package private) static boolean |
flManualClassesOnly
A flag set in Inspector.analyzeArguments(String[]) and
used to indicate that classes not covered by the manual mapping
should be ignored (this saves time).
Most of the final results are based on the manual mapping. |
(package private) static boolean |
flShowClassMatchCounts
A flag set in Inspector.analyzeArguments(String[]) and
used to indicate that the number of good matches found in each
class and each subclass should be printed. |
(package private) static boolean |
flShowDiscards
A flag set in Inspector.analyzeArguments(String[]) and
used to indicate that the user would like the VxC system to
print a line of text to stderr for each possible match that was
discarded. |
private static boolean |
isManualClass
Whether or not the current class is covered by the manual mapping file. |
private static ArrayList |
manClasses
The list of classes and subclasses covered by the manual mapping. |
(package private) static File |
manFile
The manual mapping file. |
private static String |
manualVersion
The VerbNet and Cyc versions to which the manual mapping file corresponds. |
(package private) static String |
matchConstraints
A string representation of all of the constraints supplied on the command line. |
(package private) static File |
matchFile
The file to which matches found by the automatic matching algorithm should be written. |
private static PrintWriter |
matchpw
The PrintWriter object tied to the match
output file. |
private static ArrayList |
MM
All matches specified in the manual mapping file. |
private static int |
numAMDiscardsInManClasses
The number of rules discarded by the automatic matching algorithm from classes covered by the manual mapping file. |
Constructor Summary | |
---|---|
private |
Cyc()
This constructor is private because the class is not intended to ever be instantiated. |
Method Summary | |
---|---|
private static void |
addRuleNumbersAndCompress()
Compresses the original Cyc input file into another file which has one line per verbSemTrans rule. |
(package private) static void |
closeExternalData()
Writes the end tag of the document element to the match output file and closes the stream. |
(package private) static void |
compareMatches()
Calculates all statistics and displays the final results section. |
(package private) static String |
cycExtractParticle(String rule)
Returns the particle in a particle construction (e.g. "off" in "shear off"). |
(package private) static String |
cycExtractPreposition(String rule)
Returns the preposition specified by a Cyc rule. |
(package private) static String |
cycExtractRuleNum(String rule)
Returns the rule number applied to this Cyc rule in addRuleNumbersAndCompress() . |
(package private) static String |
cycExtractTrans(String rule)
Returns the transitivity of this Cyc rule represented by a small handful of characters. |
private static String |
cycExtractVerb(String rule)
Returns the verb of a Cyc rule. |
(package private) static boolean |
cycHasParticle(String rule)
Returns whether or not a Cyc rule deals with a particle construction (e.g. "shear off"). |
(package private) static boolean |
cycHasPreposition(String rule)
Returns whether or not a Cyc rule specifies a preposition (e.g. "bicker with"). |
private static void |
eprint(String s)
Used as shorthand for System.err.print . |
private static void |
eprintln(String s)
Used as shorthand for System.err.println . |
(package private) static void |
findCycMatches(String verb,
String vnSyntax,
String vnSem,
int vnFrameID,
String vnFrameDesc)
Searches the Cyc rules for a match to the verb-frame pair currently being scanned by the Inspector. |
(package private) static void |
initialize()
Initializes some class variables and prints the header. |
private static void |
loadCycRules()
Loads the Cyc rules stored in the compressed file into class-level data structures. |
(package private) static void |
loadExternalData()
Loads all the Cyc rules from the Cyc input file and the manual matches from the manual mapping file. |
private static void |
loadManualFile()
Loads the manual matches stored in the manual mapping file into the list of manual matches MM . |
private static void |
print(String s)
Used as shorthand for System.out.print . |
(package private) static void |
printClassMatchCount(int level)
Prints the number of matches found by the automatic matching algorithm for this class or subclass. |
private static void |
printHeader()
Prints the header for this execution of the VxC mapper. |
private static void |
println(String s)
Used as shorthand for System.out.println . |
private static int |
roundPct(double d)
Returns as an integer the first two significant figures of a floating point number between 0 and 1, inclusive. |
(package private) static void |
setClass(String newClassName)
Sets the current class or subclass that the Inspector is scanning. |
private static void |
setUpMatchFile()
Prepares the match output file to be written to. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
static String flags
Inspector.analyzeArguments(String[])
,
printHeader()
static boolean flShowDiscards
Inspector.analyzeArguments(String[])
and
used to indicate that the user would like the VxC system to
print a line of text to stderr for each possible match that was
discarded. This line of text includes the reason for discarding.
This is shown by displaying the constraint symbol corresponding
to the constraint method which signalled to discard the possible match.
Remember that constraints are applied in the order they appear
in the -A operator's argument. Therefore, the reason for discard
could differ depending on the arrangement of this argument
(e.g. pti vs. itp).
Inspector.analyzeArguments(String[])
,
findCycMatches(String, String, String, int, String)
static boolean flManualClassesOnly
Inspector.analyzeArguments(String[])
and
used to indicate that classes not covered by the manual mapping
should be ignored (this saves time).
Most of the final results are based on the manual mapping.
Sometimes it makes sense to ignore all other classes, if you're
just looking to gauge performance of the matching algorithm.
Inspector.analyzeArguments(String[])
,
setClass(String)
,
findCycMatches(String, String, String, int, String)
static boolean flShowClassMatchCounts
Inspector.analyzeArguments(String[])
and
used to indicate that the number of good matches found in each
class and each subclass should be printed. This printing of a class's
or subclass's match count takes place right before a new class begins
(or when the </FRAMES> tag is reached). The sum of all these counts
will be equal to [Correct Matches] + [Incorrect Matches] in the final
results section.
Inspector.analyzeArguments(String[])
,
printClassMatchCount(int)
static File cycFile
verbSemTrans
rules from Cyc.
One is provided in the vxc.tar.gz
download file for VxC.
This file was obtained by performing these steps:
verbSemTrans
constantWordWithPrefixFn
or WordWithPrefixFn
(verbSemTrans *-MWW - rules with multi-words as first predicate argument
[False](not - negated rules
Inspector.analyzeArguments(String[])
,
printHeader()
,
addRuleNumbersAndCompress()
static File cycFileCmp
cycFile
).
When the data is extracted from Cyc, each rule spans multiple lines.
The method addRuleNumbersAndCompress()
removes newline
characters in order to create a Cyc rule file with one rule per line.
This file is always named FILE.vxc.compressed where FILE is the name
of the Cyc input file provided on the command line (-C operator). The compressed
file is placed into the directory where the java vn.Inspector
command was executed.
Inspector.analyzeArguments(String[])
,
printHeader()
,
addRuleNumbersAndCompress()
,
loadCycRules()
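The compression step described for cycFileCmp (joining each multi-line rule into one line and giving it a rule number) could look roughly like the sketch below. This is not the code of addRuleNumbersAndCompress(): the class name is hypothetical, the sketch works on an in-memory list instead of files, it assumes that every rule starts on a line beginning with "[" (as in "[Def](verbSemTrans ..."), and it assumes that numbering starts at 1. The "RULE #n" prefix and the ".vxc.compressed" suffix follow the examples in this documentation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CompressRulesSketch {

    /** Joins the lines of each multi-line rule and prefixes a running rule number. */
    static List<String> compress(List<String> rawLines) {
        List<String> out = new ArrayList<>();
        StringBuilder current = null;
        for (String line : rawLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // blank lines carry no rule text
            }
            if (trimmed.startsWith("[")) { // assumed start-of-rule marker
                if (current != null) {
                    out.add(current.toString());
                }
                current = new StringBuilder("RULE #" + (out.size() + 1) + " " + trimmed);
            } else if (current != null) {
                current.append(' ').append(trimmed); // continuation of the current rule
            }
        }
        if (current != null) {
            out.add(current.toString());
        }
        return out;
    }

    /** Name of the compressed file for a given Cyc input file name. */
    static String compressedName(String inputName) {
        return inputName + ".vxc.compressed";
    }

    public static void main(String[] args) {
        // Made-up rule bodies, used only to show the line joining.
        List<String> raw = Arrays.asList(
                "[Def](verbSemTrans Jump-TheWord 0 IntransitiveVerbFrame",
                "    (sample-body-line))",
                "",
                "[Def](verbSemTrans Bicker-TheWord 0 (PPCompFrameFn TransitivePPFrameType With-TheWord)",
                "    (sample-body-line))");
        for (String rule : compress(raw)) {
            System.out.println(rule);
        }
        System.out.println(compressedName("cyc-rules.txt"));
    }
}
```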
private static ArrayList[] cycRules
ArrayList
object. Each rule is added to the array
that corresponds to the rule's verb. This is done to decrease
access time when checking to see if a VerbNet syntax matches
any Cyc rules. Each of the 26 arrays is held in unsorted order.
There exist many other ways to store these rules so as to achieve
even faster access times.
loadCycRules()
,
findCycMatches(String, String, String, int, String)
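The cycRules description above amounts to bucketing rules by the first letter of their verb, so that a lookup scans one of 26 lists instead of all rules. A minimal sketch of such a structure follows; the class and method names are hypothetical, and the handling of verbs that do not start with a-z is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

final class RuleBucketsSketch {

    /** Creates 26 empty buckets, one per letter a-z. */
    @SuppressWarnings("unchecked")
    static List<String>[] newBuckets() {
        List<String>[] buckets = new List[26];
        for (int i = 0; i < buckets.length; i++) {
            buckets[i] = new ArrayList<>();
        }
        return buckets;
    }

    /** Maps a verb to its bucket index using its first letter (case-insensitive). */
    static int bucketIndex(String verb) {
        char first = Character.toLowerCase(verb.charAt(0));
        if (first < 'a' || first > 'z') {
            throw new IllegalArgumentException("verb must start with a-z: " + verb);
        }
        return first - 'a';
    }

    /** Appends a rule to the bucket of its verb; buckets stay unsorted. */
    static void add(List<String>[] buckets, String verb, String rule) {
        buckets[bucketIndex(verb)].add(rule);
    }

    /** Returns the only rules that need to be scanned for the given VerbNet verb. */
    static List<String> candidates(List<String>[] buckets, String verb) {
        return buckets[bucketIndex(verb)];
    }

    public static void main(String[] args) {
        List<String>[] buckets = newBuckets();
        add(buckets, "jump", "rule for jump");
        add(buckets, "jog", "rule for jog");
        add(buckets, "bicker", "rule for bicker");
        System.out.println(candidates(buckets, "jump").size()); // 2
    }
}
```

One of the faster alternatives hinted at above might be a map keyed by the whole verb, which would avoid scanning unrelated verbs that merely share a first letter.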
static File manFile
vxc.tar.gz
download file
for VxC.
Inspector.analyzeArguments(String[])
,
printHeader()
,
loadManualFile()
private static ArrayList AM
Cyc.Match
object is added to this array
every time the automatic matching algorithm has no
reason, based on the desired constraints, to discard
the match.
initialize()
,
findCycMatches(String, String, String, int, String)
,
compareMatches()
private static ArrayList MM
Cyc.Match
object added to this array
for each
initialize()
,
loadManualFile()
,
compareMatches()
private static ArrayList manClasses
setClass(String)
,
initialize()
,
loadManualFile()
,
compareMatches()
private static int numAMDiscardsInManClasses
findCycMatches(String, String, String, int, String)
,
compareMatches()
private static String manualVersion
loadManualFile()
,
setUpMatchFile()
static File matchFile
Inspector.analyzeArguments(String[])
,
printHeader()
,
setUpMatchFile()
private static PrintWriter matchpw
PrintWriter
object tied to the match
output file.
setUpMatchFile()
,
closeExternalData()
,
findCycMatches(String, String, String, int, String)
static final String ALL_MATCH_CONSTRS
Cyc
class's description for an explanation
of each constraint.
Inspector.analyzeArguments(String[])
,
Constant Field Values
static String matchConstraints
Inspector.analyzeArguments(String[])
,
printHeader()
,
findCycMatches(String, String, String, int, String)
private static String curClass
setClass(String)
,
printClassMatchCount(int)
,
findCycMatches(String, String, String, int, String)
private static boolean isManualClass
setClass(String)
,
findCycMatches(String, String, String, int, String)
private static int classMatchCount
AM
. This is used only to implement the
-Sc flag (flShowClassMatchCounts
).
setClass(String)
,
printClassMatchCount(int)
,
findCycMatches(String, String, String, int, String)
Constructor Detail |
---|
private Cyc()
Method Detail |
---|
private static void println(String s)
System.out.println
.
s - the string to print to stdout, followed by a line separator
PrintStream.println(String)
private static void print(String s)
System.out.print
.
s - the string to print to stdout
PrintStream.print(String)
private static void eprintln(String s)
System.err.println
.
s - the string to print
PrintStream.println(String)
private static void eprint(String s)
System.err.print
.
s - the string to print
PrintStream.print(String)
static void setClass(String newClassName)
newClassName - the new class name ('ID' attribute from VerbNet XML files)
EventManager.fireEvent(int, int, String, String, String, Element, Element)
,
isManualClass
,
manClasses
static void printClassMatchCount(int level)
flShowClassMatchCounts is false.
EventManager.fireEvent(int, int, String, String, String, Element, Element)
static void initialize()
EventManager.fireEvent(int, int, String, String, String, Element, Element)
,
printHeader()
private static void printHeader()
Inspector.filePath(File)
,
initialize()
static void loadExternalData()
EventManager.fireEvent(int, int, String, String, String, Element, Element)
static void closeExternalData()
EventManager.fireEvent(int, int, String, String, String, Element, Element)
private static void addRuleNumbersAndCompress() throws Cyc.InvalidCycInputFileException, IOException
java vn.Inspector
was invoked.
Cyc.InvalidCycInputFileException
IOException
loadExternalData()
private static void loadCycRules() throws IOException
IOException
loadExternalData()
,
cycRules
private static void setUpMatchFile() throws IOException
IOException
loadExternalData()
,
findCycMatches(String, String, String, int, String)
private static void loadManualFile() throws Exception
MM
. Loads the
names of all the classes covered by the manual mapping file into
the list of manual mapping classes manClasses
.
This method assumes there are both <tuple> and <tuple-m> (maybe tuples)
elements in the manual mapping file. This method considers both
of them to be good matches. Only the final results section makes
a distinction when printing out those matches which the automatic
matching algorithm failed to match by printing the word 'maybe'
next to a match if it was so identified in the manual mapping file.
Exception
loadExternalData()
,
MM
,
manClasses
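The loadManualFile() description above says that both <tuple> and <tuple-m> elements are treated as good matches. The sketch below shows one way to collect both element types with the standard DOM API. It is not the VxC loader: the class name, the root element, and the nesting of the sample XML are hypothetical, since the real layout of the manual mapping file is not described here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class ManualMappingSketch {

    /** Returns {number of tuple elements, number of tuple-m elements} found in the XML text. */
    static int[] countTuples(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            // getElementsByTagName matches the exact tag name, so "tuple" does not include "tuple-m".
            int sure = doc.getElementsByTagName("tuple").getLength();
            int maybe = doc.getElementsByTagName("tuple-m").getLength();
            return new int[] {sure, maybe};
        } catch (Exception e) {
            throw new IllegalStateException("could not parse manual mapping XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical file layout, for illustration only.
        String xml = "<manual-mapping><class><tuple/><tuple-m/><tuple/></class></manual-mapping>";
        int[] counts = countTuples(xml);
        // Both kinds count as good matches; only reporting distinguishes the 'maybe' ones.
        System.out.println("good matches: " + (counts[0] + counts[1]));
        System.out.println("of which maybe: " + counts[1]);
    }
}
```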
private static String cycExtractVerb(String rule)
RULE #2244 [Def](verbSemTrans Jump-TheWord 0 IntransitiveVerbFrame ...
rule
- the Cyc rule to examine
loadCycRules()
,
findCycMatches(String, String, String, int, String)
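For the cycExtractVerb example rule shown above, the verb is the token before "-TheWord" that follows verbSemTrans. A regular-expression sketch of that extraction is given below. The class name and the lower-casing of the result are assumptions; the actual method may parse the rule differently.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ExtractVerbSketch {

    /** Captures the word constant that follows "(verbSemTrans", without its "-TheWord" suffix. */
    private static final Pattern VERB = Pattern.compile("\\(verbSemTrans\\s+(\\S+?)-TheWord");

    /** Returns the verb in lower case, or null if the rule does not have the expected shape. */
    static String extractVerb(String rule) {
        Matcher m = VERB.matcher(rule);
        return m.find() ? m.group(1).toLowerCase(Locale.ROOT) : null;
    }

    public static void main(String[] args) {
        String rule = "RULE #2244 [Def](verbSemTrans Jump-TheWord 0 IntransitiveVerbFrame ...";
        System.out.println(extractVerb(rule)); // jump
    }
}
```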
static boolean cycHasParticle(String rule)
ParticleCompFrameFn
.
rule
- the Cyc rule to examine
static String cycExtractParticle(String rule)
rule
- the Cyc rule to examine
static boolean cycHasPreposition(String rule)
PPCompFrameFn. A Cyc rule that specifies
a preposition in this manner is valid only when
the verb is used with the given preposition.
rule
- the Cyc rule to examine
Matcher.cycPreposition(String, String)
static String cycExtractPreposition(String rule)
[Def](verbSemTrans Bicker-TheWord 0 (PPCompFrameFn TransitivePPFrameType With-TheWord) ...
rule
- the Cyc rule to examine
Matcher.cycPreposition(String, String)
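In the cycExtractPreposition example rule above, the preposition appears as the last argument of the PPCompFrameFn expression. The sketch below extracts it with a regular expression. The class and method names are hypothetical, and it assumes that PPCompFrameFn always takes a frame type followed by a single "-TheWord" constant, as in the example.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ExtractPrepositionSketch {

    /** Matches "(PPCompFrameFn <frameType> <Prep>-TheWord)" and captures <Prep>. */
    private static final Pattern PREP =
            Pattern.compile("\\(PPCompFrameFn\\s+\\S+\\s+(\\S+?)-TheWord\\)");

    /** A rule is taken to specify a preposition when it uses PPCompFrameFn. */
    static boolean hasPreposition(String rule) {
        return rule.contains("PPCompFrameFn");
    }

    /** Returns the preposition in lower case, or null if none is found. */
    static String extractPreposition(String rule) {
        Matcher m = PREP.matcher(rule);
        return m.find() ? m.group(1).toLowerCase(Locale.ROOT) : null;
    }

    public static void main(String[] args) {
        String rule = "[Def](verbSemTrans Bicker-TheWord 0 "
                + "(PPCompFrameFn TransitivePPFrameType With-TheWord) ...";
        System.out.println(hasPreposition(rule));     // true
        System.out.println(extractPreposition(rule)); // with
    }
}
```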
static String cycExtractRuleNum(String rule)
addRuleNumbersAndCompress()
.
rule
- the Cyc rule to examine
addRuleNumbersAndCompress()
,
findCycMatches(String, String, String, int, String)
static String cycExtractTrans(String rule)
T - TransitiveNPFrame
I - IntransitiveVerbFrame
M - Middle Voice Frame
TGPF - TransitiveGerundPhraseFrame
TIPF - TransitiveInfinitivePhraseFrame
DDPFT - DitransitivePPFrameType
? - any other type
rule
- the Cyc rule to examine
Matcher.transitivity(String, String)
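The transitivity codes listed for cycExtractTrans can be pictured as a lookup from the frame name found in a rule to a short code. The following sketch assumes that the frame is detected by a simple substring search and that "M" corresponds to the MiddleVoiceFrame name used in the constraint table; the class name is hypothetical and the real method may inspect the rule differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TransitivityCodeSketch {

    /** Frame name to code, in the order listed in the documentation. */
    private static final Map<String, String> CODES = new LinkedHashMap<>();
    static {
        CODES.put("TransitiveNPFrame", "T");
        CODES.put("IntransitiveVerbFrame", "I");
        CODES.put("MiddleVoiceFrame", "M");
        CODES.put("TransitiveGerundPhraseFrame", "TGPF");
        CODES.put("TransitiveInfinitivePhraseFrame", "TIPF");
        CODES.put("DitransitivePPFrameType", "DDPFT");
    }

    /** Returns the code of the first known frame name found in the rule, or "?" for any other type. */
    static String code(String rule) {
        for (Map.Entry<String, String> e : CODES.entrySet()) {
            if (rule.contains(e.getKey())) {
                return e.getValue();
            }
        }
        return "?";
    }

    public static void main(String[] args) {
        System.out.println(code("RULE #2244 [Def](verbSemTrans Jump-TheWord 0 IntransitiveVerbFrame ...")); // I
        System.out.println(code("[Def](verbSemTrans Bicker-TheWord 0 "
                + "(PPCompFrameFn TransitivePPFrameType With-TheWord) ...")); // ?
    }
}
```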
static void findCycMatches(String verb, String vnSyntax, String vnSem, int vnFrameID, String vnFrameDesc)
EventManager
class. Not all of the Cyc rules are
searched. In order to increase the speed of this method,
the Cyc rules are broken apart into separate lists based on
the first letter of their verb. When this method is invoked,
it only scans the list corresponding to the first letter
of the VerbNet verb in question (there are even more efficient
ways of implementing this).
AM
.
verb - the verb from the VerbNet verb-frame pair
vnSyntax - the syntax from the VerbNet verb-frame pair
vnSem - the semantics from the VerbNet verb-frame pair. This is a conglomeration of all the semantic predicates in one string.
vnFrameID - the frame number from the VerbNet verb-frame pair
vnFrameDesc - the primary and secondary frame descriptions from the VerbNet verb-frame pair
cycRules
,
findCycMatches(String, String, String, int, String)
,
EventManager.fireEvent(int, int, String, String, String, Element, Element)
,
Matcher
static void compareMatches()
AM
,
MM
, and numAMDiscardsInManClasses
.
Final results are displayed to stdout and consist of three parts:
EventManager.fireEvent(int, int, String, String, String, Element, Element)
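The comparison performed by compareMatches() between the automatic matches (AM) and the manual matches (MM) can be thought of as set arithmetic. The sketch below is a simplified illustration and not the actual statistics code: matches are represented as hypothetical string keys, and the three tallies (correct, incorrect, missed) are an assumed breakdown. It does keep the property stated for flShowClassMatchCounts, namely that correct plus incorrect equals the number of automatic matches.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class CompareMatchesSketch {

    /** Returns {correct, incorrect, missed} for automatic matches judged against manual ones. */
    static int[] tally(Set<String> automatic, Set<String> manual) {
        Set<String> correct = new HashSet<>(automatic);
        correct.retainAll(manual);                        // found by both
        int incorrect = automatic.size() - correct.size(); // automatic only
        int missed = manual.size() - correct.size();       // manual only
        return new int[] {correct.size(), incorrect, missed};
    }

    public static void main(String[] args) {
        // Hypothetical keys of the form verb|frame|rule.
        Set<String> am = new HashSet<>(Arrays.asList("jump|1|2244", "jump|2|17", "bicker|1|90"));
        Set<String> mm = new HashSet<>(Arrays.asList("jump|1|2244", "bicker|1|90", "bicker|2|91"));
        int[] t = tally(am, mm);
        System.out.println("correct=" + t[0] + " incorrect=" + t[1] + " missed=" + t[2]);
    }
}
```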
private static int roundPct(double d)
d
- the floating point value to round and convert to a percentage
compareMatches()
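One plausible reading of roundPct(double), based only on its summary ("the first two significant figures of a floating point number between 0 and 1, inclusive"), is a conversion to a whole-number percentage. The sketch below makes that assumption and rounds to the nearest integer; the real method might truncate instead, and the range check is an addition for illustration.

```java
final class RoundPctSketch {

    /** Converts a fraction in [0, 1] to a whole-number percentage (assumed rounding, not truncation). */
    static int roundPct(double d) {
        if (d < 0.0 || d > 1.0) {
            throw new IllegalArgumentException("expected a value between 0 and 1, got " + d);
        }
        return (int) Math.round(d * 100.0);
    }

    public static void main(String[] args) {
        System.out.println(roundPct(0.4567)); // 46
        System.out.println(roundPct(1.0));    // 100
    }
}
```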