Proceedings
The proceedings of KONVENS 2016, as well as the associated workshop NLP4CMC III, are published via the online publication series Bochumer Linguistische Arbeitsberichte (BLA). On this page, you can additionally download files for individual papers.
Invited talks
- John Nerbonne: Invited Talk: Data from Non-standard Varieties
- Barbara Plank: Invited Talk: What to do about non-standard (or non-canonical) language in NLP
Papers
- Adrien Barbaresi: Bootstrapped OCR error detection for a less-resourced language variant
- Fabian Barteld, Ingrid Schröder and Heike Zinsmeister: text-gamma – Inter-annotator agreement for categorization with simultaneous segmentation and transcription-error correction
- Dolores Batinić, Sandra Birzer and Heike Zinsmeister: Creating an extensible, levelled study corpus of Russian
- Reto Baumgartner: Morphological analysis and lemmatization for Swiss German using weighted transducers
- Kay Berkling: Item Presentation in Primers – An Analysis Based on Acquisition Research
- Simon Clematide, Karina Frick, Noëmi Aepli and Jean-Philippe Goldman: Crowdsourcing Swiss Dialect Transcriptions for Assessing Factors in Writing Variations
- Daniel Dakota: Brown clustering for unlexicalized parsing
- Carlota de Benito Moreno, Javier Pueyo and Inés Fernández-Ordóñez: Creating and designing a corpus of rural Spanish
- Mirela-Stefania Duma and Wolfgang Menzel: Paragraph Vector for Data Selection in Statistical Machine Translation
- Kerstin Eckart and Markus Gärtner: Creating Silver Standard Annotations for a Corpus of Non-Standard Data
- Maud Ehrmann, Giovanni Colavizza, Yannick Rochat and Frédéric Kaplan: Diachronic Evaluation of NER Systems on Old Newspapers
- Timo Gühring, Nicklas Linz, Rafael Theis and Annemarie Friedrich: SWAN: an easy-to-use web-based annotation system
- Sviatlana Höhn, Alain Pfeiffer and Eric Ras: Challenges of error annotation in native/non-native speaker chat
- Eva Horch and Ingo Reich: On “Article Omission” in German and the “Uniform Information Density Hypothesis”
- Gerhard Jäger and Pavel Sofroniev: Automatic cognate classification with a Support Vector Machine
- Christine Köhn, Tobias Staron and Arne Köhn: Parsing Free-Form Language Learner Data: Current State and Error Analysis
- Nikola Ljubešić, Katja Zupan, Darja Fišer and Tomaž Erjavec: Normalising Slovene data: historical texts vs. user-generated content
- Harald Lüngen, Michael Beißwenger, Eric Ehrhardt, Axel Herold and Angelika Storrer: Integrating corpora of computer-mediated communication in CLARIN-D: Results from the curation project ChatCorpus2CLARIN
- Jose Manuel Martinez Martinez, Ekaterina Lapshinova-Koltunski and Kerstin Kunz: Annotation of Lexical Cohesion in English and German: Automatic and Manual Procedures
- Rahel Oppliger: Automatic authorship attribution based on character n-grams in Swiss German
- Sebastian Padó, Jan Šnajder, Jason Utt and Britta D. Zeller: Smoothing Syntax-Based Semantic Spaces: Let The Winner Take It All
- Alexander Panchenko, Johannes Simon, Martin Riedl and Chris Biemann: Noun Sense Induction and Disambiguation using Graph-Based Distributional Semantics
- Polina Panicheva and Olga Mitrofanova: Developing a Toolkit for Distributional Analysis of Abnormal Collocations in Russian
- Michael Percillier: Verb lemmatization and semantic verb classes in a Middle English corpus
- Martin Riedl, Tim Feuerbach and Chris Biemann: Running into Brick Walls Attempting to Improve a Simple Unsupervised Parser
- Tobias Roth: Isolation and Mapping of Place-Name Forms in Toponymic Data
- Josef Ruppenhofer and Jasper Brandes: Verifying the robustness of opinion inference
- Dietmar Schabus, Brigitte Krenn and Friedrich Neubarth: Data-Driven Identification of Dialogue Acts in Chat Messages
- Tatjana Scheffler and Manfred Stede: Mapping PDTB-style connective annotation to RST-style discourse annotation
- Yves Scherrer and Nikola Ljubešić: Automatic normalisation of the Swiss German ArchiMob corpus using character-level machine translation
- Gerold Schneider, Marianne Hundt and Rahel Oppliger: Part-Of-Speech in Historical Corpora: Tagger Evaluation and Ensemble Systems on ARCHER
- Fabian Simonjetz and Adam Roussel: Crosslinguistic Annotation of German and English Shell Noun Complexes
- Julia Suter, Sarah Ebling and Martin Volk: Rule-based Automatic Text Simplification for German
- Martin Volk, Chantal Amrhein, Noëmi Aepli, Mathias Müller and Phillip Ströbel: Building a Parallel Corpus on the World’s Oldest Banking Magazine
- Martin Volk, Simon Clematide, Johannes Graën and Phillip Ströbel: Bi-particle Adverbs, PoS-Tagging and the Recognition of German Separable Prefix Verbs
- Daniel Weber and Desislava Zhekova: TweetNorm: Text Normalization on Italian Twitter Data
- Michael Wojatzki and Torsten Zesch: Stance-based Argument Mining – Modeling Implicit Argumentation Using Stance
- Inès Zribi, Inès Kammoun, Mariem Ellouze, Lamia Hadrich Belguith and Philippe Blache: Sentence Boundary Detection for Transcribed Tunisian Arabic