האיגוד הישראלי לטכנולוגיות שפת אנוש
الرابطة الإسرائيلية لتكنولوجيا اللغة البشرية
The Israeli Association of Human Language Technologies

Hebrew & Arabic Corpus Linguistics Infrastructure
IAHLT HTB (Hebrew Tree Bank)
• Update existing HTB:
• Valid by UD standards
• Comparable to other languages (esp. Arabic)
• Findable named mentions, consistent part-of-speech tagging
• Removal of inserted tokens – every word is the sum of its segments
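
The "sum of its segments" property can be checked mechanically. The following is a minimal sketch, assuming the treebank is stored in standard CoNLL-U, where a multiword token appears on a range line (ID such as 1-2) followed by its segment lines. The function name check_segments, the helper row, and the two sample sentences are hypothetical illustrations, not part of the IAHLT tooling or data.

```python
def check_segments(conllu_text):
    """Return (range_id, surface_form, joined_segments) for every multiword
    token whose surface form differs from the concatenation of its segments."""
    mismatches = []
    forms = {}    # word ID -> FORM, for the current sentence
    ranges = []   # (range_id, start, end, surface form), for the current sentence

    def flush():
        # Compare each multiword token with its joined segments, then reset.
        for range_id, start, end, surface in ranges:
            joined = "".join(forms.get(i, "") for i in range(start, end + 1))
            if joined != surface:
                mismatches.append((range_id, surface, joined))
        forms.clear()
        ranges.clear()

    for line in conllu_text.splitlines():
        if not line.strip():          # blank line ends a sentence
            flush()
            continue
        if line.startswith("#"):      # sentence-level comment
            continue
        cols = line.split("\t")
        tok_id, form = cols[0], cols[1]
        if "-" in tok_id:             # multiword token range, e.g. "1-2"
            start, end = (int(x) for x in tok_id.split("-"))
            ranges.append((tok_id, start, end, form))
        elif "." in tok_id:           # empty node, not part of the surface text
            continue
        else:
            forms[int(tok_id)] = form
    flush()                           # last sentence may lack a trailing blank line
    return mismatches


def row(tok_id, form):
    # Hypothetical helper: build a 10-column CoNLL-U line with unused fields as "_".
    return "\t".join([tok_id, form] + ["_"] * 8)


# Illustrative sentences only: segments that add up, and one with an inserted token.
good = "\n".join([row("1-2", "בבית"), row("1", "ב"), row("2", "בית")])
bad = "\n".join([row("1-3", "בבית"), row("1", "ב"), row("2", "ה"), row("3", "בית")])

print(check_segments(good))
print(check_segments(bad))
```

An empty result means every multiword token in the input is exactly the concatenation of its segments; any tuple returned points to a range whose segments include inserted or altered material.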
Prof. Amir Zeldes presents the HTB Project
A Second Wave of Gold Standard Hebrew Treebanks:
Challenges and Applications
Version controlled and available from:
https://github.com/IAHLT/UD_Hebrew
• Build new treebanks to supplement HTB:
• Current topics
• A wide range of genres
• Written and spoken data
• Larger-scale data to support ‘high-resource’ language quality
IAHLT UD Hebrew Treebank
https://github.com/UniversalDependencies/UD_Hebrew-IAHLTwiki
NLP resources for Hebrew (IAHLT UD Treebank @2.3)
https://github.com/NNLP-IL/Resources