THE LONDON SCHOOL OF ECONOMICS AND POLITICAL SCIENCE

DATA SCIENCE: TEXT ANALYSIS USING R

Gain the ability to extract key insights from large amounts of text through the practical application of data analysis techniques.

Duration

8 weeks,
excluding orientation

Effort

8–10 hours per week,
self-paced learning online

Learning Format

Weekly modules,
flexible learning

ON COMPLETION OF THIS COURSE, YOU’LL WALK AWAY WITH:

1. An enhanced analytical skill set incorporating various text analysis techniques, from deconstructing data in Quanteda to analysing and interpreting it using R.

2. An improved ability to derive critical insights and predict trends from textual data at scale, helping your organisation remain competitive.

3. Comprehensive knowledge of the complete text analysis process, including preparing raw data, clustering and classification, and dissecting the results.

This Data Science: Text Analysis Using R online certificate course is certified by the United Kingdom CPD Certification Service, and may be applicable to individuals who are members of, or are associated with, UK-based professional bodies. The course has an estimated 70 hours of learning.

Note: should you wish to claim CPD activity, the onus is on you. The London School of Economics and Political Science (LSE) and GetSmarter accept no responsibility, and cannot be held responsible, for the claiming or validation of hours or points.

This course is technical in nature and makes use of coding in R. Some algebraic and calculus knowledge is strongly advised, but is not required. Training in tertiary-level statistics and knowledge of a functional or object-orientated language are also advantageous. HTML is not considered a programming language in this context.

COURSE CURRICULUM

Over the duration of this online certificate course, you’ll work through the following modules:

MODULE 1
Working with textual data in R

Explore the motivations for using text analysis and gain a practical understanding of the text analysis process.

MODULE 2
Cleaning, processing, and transforming text

Learn how to ingest and inspect documents and execute processing techniques ahead of deeper analysis using Quanteda.

MODULE 3
Data visualisation and descriptive statistics for text

Discover how visual summaries and statistical explanations are used to extract key insights from textual data.

MODULE 4
Clustering methods for words and documents

Learn how to partition data according to specific groups using clustering techniques in R.

MODULE 5
Topic models

Discover how topic models can be used to identify hidden semantic structures in text.

MODULE 6
Sentiment analysis

Explore the use of sentiment analysis to identify subjective information in text.

MODULE 7
Document classification

Understand how to fit, use, and evaluate classification models for predictions.

MODULE 8
Social media analysis

Apply text analysis techniques to gain insights from social media data.

Please note that module titles and their contents are subject to change during course development.

COURSE CONVENOR

Professor Kenneth Benoit

Director of the Data Science Institute and Professor of Computational Social Science

Kenneth is director of the Data Science Institute at LSE. His research focuses on quantitative methods for processing large amounts of textual and other forms of big data – mainly political texts and social media – and the methodology behind text mining. Kenneth is the creator and co-author of several popular R packages for text analysis, including Quanteda, Spacyr, and Readtext. He has published extensively on applications of measurement and the analysis of text as data in political science, and has co-authored several books, including Party Policy in Modern Democracies, Intra-Party Politics and Coalition Governments, and Quantitative Text Analysis Using R (in progress). Kenneth is also a part-time professor in the School of Politics and International Relations at the Australian National University. He previously held positions in the Department of Political Science at Trinity College Dublin and at the Central European University (Budapest). Kenneth received his PhD in government with a specialisation in statistical methodology from Harvard University.

AN ONLINE EDUCATION THAT SETS YOU APART

This LSE online certificate course is delivered in collaboration with online education provider GetSmarter, part of edX. Join a growing community of global professionals and benefit from the opportunity to:

Gain verifiable and relevant competencies and earn invaluable recognition from a world-leading social science university, entirely online and in your own time.

Enjoy a personalised, people-mediated online learning experience created to make you feel supported at every step.

Experience a flexible but structured approach to online education as you plan your learning around your life to meet weekly milestones.

