Resources

Tutorials

  • Creating a Personalized Vowel Chart with PRAAT and R
    This tutorial exemplifies how to create a personalized vowel chart with PRAAT and R. The data can be downloaded here, the RP reference data here, and the full R script is available here.
  • A sociolinguistic analysis in R – curse word use in Irish English
    This tutorial exemplifies how to perform a simple corpus analysis with R. The example addresses two questions: whether male or female speakers use more curse words in Irish English, and whether curse word use is related to the age of speakers.
  • Text Mining with R: Building a Text Classifier
    This tutorial exemplifies how to create a text classifier with R, i.e. it implements a machine-learning algorithm that classifies texts as being a speech by either Barack Obama or Mitt Romney. The script is based on Timothy D’Auria’s YouTube tutorial “How to Build a Text Mining, Machine Learning Document Classification System in R!” (https://www.youtube.com/watch?v=j1V2McKbkLo). The data is available here and the code for downloading the speeches is available here.
  • Part-of-Speech Tagging with R
    This tutorial exemplifies how to tag a corpus with R. Part-of-Speech tagging, or POS tagging, is a form of annotating text in which POS tags are assigned to lexical items.
  • (Syntactic) Parsing in R
    This tutorial exemplifies how to syntactically parse a corpus with R.
  • Simple Linear Regression – a practical example: preposition use over time
    This tutorial exemplifies how to implement a simple linear regression in R (the R code is based on Field, Miles, and Field 2012).
  • Plotting with R
    This tutorial exemplifies how to visualize data in R.
  • Writing Functions In R
    This tutorial exemplifies how to write and call functions in R.
  • Two-Layer Configural Frequency Analysis
    This tutorial exemplifies how to implement a CFA (configural frequency analysis) with only two layers or configurations in R.

For students

  • Information Sheet for Seminars (Merkblatt für Seminare)
    You will find a document with general information about my seminars here. Please read this document if you are attending or plan to attend one of my seminars! (last updated 2015/02/16)
  • Model Term Paper
    You will find a model term paper here. This model term paper includes information about the structure, content, and formatting of term papers. You can also use it as a template for your own term papers and use the formatting within the model. (last updated 2015/04/08)
  • Course Materials
    “Introduction to English Linguistics”
    “Methods in Linguistics/Methoden der Linguistik”

Corpus Linguistics Resources
Below you can find some scripts and data sets that are useful for analyzing natural language data.

  • R scripts
    • Chi-squared test for subtables of 2*k tables (R script)
    • Configural Frequency Analysis for data with only two level configurations (R script)
    • Syntactic parsing in R (paRsing, R script)
    • Function to plot several ggplots in one window (R script)
    • Function written by Tony Breyal for downloading text from websites (to create corpora containing web data) (R script)
    • Function providing nice summaries of simple linear regressions (R script)
    • Function providing nice summaries of multiple linear regressions (R script)
    • Function providing nice summaries of fixed-effects binomial logistic regressions (R script)
    • Function providing nice summaries of step-wise step-up model fitting of fixed-effects binomial logistic regressions (R script)
    • Function providing nice summaries of mixed-effects binomial logistic regressions (R script)
    • Function providing nice summaries of step-wise step-up model fitting of mixed-effects binomial logistic regressions (R script)
    • Function providing nice summaries of step-wise step-down model fitting of mixed-effects binomial logistic regressions (R script)
    • ConcR_2.3: an even more improved R function for concordancing with R.
    • ConcR_2.0: an even more improved R function for concordancing with R.
    • ConcR_1.2: a much improved R function for concordancing with R.
    • ConcR: an R function for concordancing with R.
    • Biodata scripts & data sets (last updated 2015/02/09)
      If you find any bugs in the code or mistakes in the results, please let me know so I can correct the scripts and update the results.

  • R for Linguists
    A first draft of some sections from one of my book projects.
  • Data for plotting examples
    Biodata from the Irish component of the ICE family of corpora for plotting exercises.
  • TestCorpus
    A small sample corpus for testing functions.

(last updated 2017/12/30)
