WSU talk: info on corpora and tech that will be discussed
10/07/12 09:26 Filed in: Info
I’ll give a talk on corpora and related technologies at Wayne State University in Detroit on October 19 at 11 AM. Here are some links, papers, and slides that colleagues and students may want to follow up on and work through afterwards:
Talk: Piotr Banski "TEI XML for Linguists"
04/18/12 21:20 Filed in: Info
Please join us for a talk by:
Dr. Piotr Banski (Institute for German Language/Institut fuer Deutsche Sprache, Mannheim, Germany)
Title: "TEI XML for Linguists"
Time: Friday, April 20, 2012 at 2:00 pm
Location: Suite 104, Cooper Building, on the Eastern Michigan University campus (see Google maps)
TEI online converter: OxGarage Converter
03/08/12 12:35 Filed in: Corpus Linguistics
The online OxGarage Converter on the TEI pages converts between a wide range of document formats, and in particular to TEI XML. It apparently uses the OpenOffice filters and converters as batch processors in the backend, the same ones described here for manual conversion.
Text analyzed and parsed to TEI XML wrapper
02/24/12 21:23 Filed in: Info
I set up a simple test page for a wrapper that turns raw text into TEI XML. This version uses only the Stanford CoreNLP tools to tokenize the input, detect sentence boundaries, annotate parts of speech, and lemmatize. Just paste a paragraph of text into it. The next version will add NLP tools for a few more languages, as well as further analysis components and tools for English.
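To give a rough idea of what such a wrapper produces, here is a minimal Python 3 sketch. It is not the code behind the test page and it does not call Stanford CoreNLP: the regex-based `split_sentences` and `tokenize` functions are naive stand-ins, and part-of-speech tags and lemmas are left out entirely. The function names and the choice of `<s>`, `<w>` and `<pc>` elements inside a single `<p>` are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def q(tag):
    """Return the tag name qualified with the TEI namespace."""
    return f"{{{TEI_NS}}}{tag}"


def split_sentences(text):
    """Naive stand-in for a sentence splitter: break after . ! or ?"""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    """Naive stand-in for a tokenizer: word runs or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def wrap_in_tei(text, title="Untitled"):
    """Wrap raw text in a minimal TEI document and return it as a string."""
    ET.register_namespace("", TEI_NS)
    tei = ET.Element(q("TEI"))

    # Minimal header: title, publication and source statements.
    file_desc = ET.SubElement(ET.SubElement(tei, q("teiHeader")), q("fileDesc"))
    ET.SubElement(ET.SubElement(file_desc, q("titleStmt")), q("title")).text = title
    pub = ET.SubElement(file_desc, q("publicationStmt"))
    ET.SubElement(pub, q("p")).text = "Unpublished test document"
    src = ET.SubElement(file_desc, q("sourceDesc"))
    ET.SubElement(src, q("p")).text = "Generated from pasted raw text"

    # Body: one paragraph, one <s> per sentence, <w> or <pc> per token.
    body = ET.SubElement(ET.SubElement(tei, q("text")), q("body"))
    para = ET.SubElement(body, q("p"))
    for sentence in split_sentences(text):
        s_elem = ET.SubElement(para, q("s"))
        for token in tokenize(sentence):
            tag = "w" if re.fullmatch(r"\w+", token) else "pc"
            ET.SubElement(s_elem, q(tag)).text = token
    return ET.tostring(tei, encoding="unicode")


if __name__ == "__main__":
    print(wrap_in_tei("Hello world. How are you?", title="Demo"))
```

A real pipeline would replace the two stand-in functions with the output of an NLP toolkit and attach the tagger and lemmatizer results to each word element.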
Intensive Python class for Linguists (for corpus linguistics, language data processing and manipulation etc.)
11/16/11 17:55 Filed in: Info
I am offering an intensive class for the LING519 students, everyone at the Linguist List, and anyone else who might be interested, this Saturday, November 19, 2011, at 10 AM Eastern Time in Cooper, the LinguistList Suite. We plan to meet for four hours or more, depending on speed and interest. Let me know if you want to join us. I will already be sharing the screen and the audio with Zadar, so we can include you remotely if you cannot come in person. The topics covered might be:
Intro to Python 3
Using Komodo Edit 6.x
Processing corpora like the Brown corpus (raw text with slash-pos, or TEI XML), the Penn Treebank, the Croatian Language Corpus etc.
Generating statistical models and profiles: frequency profiles, N-gram models
Calculating significance, mutual information, relative entropy, …
Simple Finite State Machines
Simple Parsers
Generating outputs of analyses: CSV, HTML, XML, etc.
…
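To give a flavor of the kind of exercise these topics involve, here is a small Python 3 sketch, not actual class material. It reads text in slash-POS style, builds token, tag and bigram frequency profiles, and computes pointwise mutual information for a bigram. The sample line is made up for illustration and is not taken from the Brown corpus, and the function names are my own assumptions.

```python
import math
from collections import Counter

# Made-up sample in token/tag format (illustrative only, not real Brown data).
SAMPLE = ("the/at dog/nn barks/vbz ./. the/at cat/nn sleeps/vbz ./. "
          "the/at dog/nn sleeps/vbz ./.")


def parse_slash_pos(line):
    """Split 'token/tag' items into (token, tag) pairs.

    The tag is whatever follows the last slash, so '1/2/cd' yields
    ('1/2', 'cd'). Items without a slash get an empty tag.
    """
    pairs = []
    for item in line.split():
        token, sep, tag = item.rpartition("/")
        pairs.append((token, tag) if sep else (item, ""))
    return pairs


def ngrams(seq, n):
    """Return all n-grams of a sequence as a list of tuples."""
    return list(zip(*(seq[i:] for i in range(n))))


def pmi(bigram, unigram_counts, bigram_counts):
    """Pointwise mutual information in bits: log2(P(x,y) / (P(x) * P(y)))."""
    n_uni = sum(unigram_counts.values())
    n_bi = sum(bigram_counts.values())
    p_xy = bigram_counts[bigram] / n_bi
    p_x = unigram_counts[bigram[0]] / n_uni
    p_y = unigram_counts[bigram[1]] / n_uni
    return math.log2(p_xy / (p_x * p_y))


pairs = parse_slash_pos(SAMPLE)
tokens = [token for token, _ in pairs]
token_freq = Counter(tokens)                    # token frequency profile
tag_freq = Counter(tag for _, tag in pairs)     # tag frequency profile
bigram_freq = Counter(ngrams(tokens, 2))        # raw bigram counts

if __name__ == "__main__":
    for token, count in token_freq.most_common(3):
        print(token, count, round(count / len(tokens), 3))
    print(round(pmi(("the", "dog"), token_freq, bigram_freq), 3))
```

The probabilities here are plain relative frequencies without smoothing, which is fine for a toy sample but would need refinement on a real corpus.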
DC
TEI XML export in OpenOffice again...
07/29/10 07:45 Filed in: TEI
Since the course pages have disappeared, here once more is a summary of how to quickly export a document in, for example, Word, OpenOffice (LibreOffice, NeoOffice), RTF, or another format to TEI XML P5.