UofT Scrapers
This is a library of scrapers for various University of Toronto websites. It is built to generate up-to-date databases for Cobalt, but is distributed as a stand-alone library that anyone can use.
Table of Contents
- Requirements
- Installation
- Usage
- Library Reference
Requirements
Installation
pip install uoftscrapers
Usage
import uoftscrapers
# Example: scrape http://map.utoronto.ca building data to ./some/path
uoftscrapers.Buildings.scrape('./some/path')
# Example: scrape http://coursefinder.utoronto.ca to current working directory
uoftscrapers.Courses.scrape()
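Once a scraper has run, its output can be read back for further processing. The sketch below assumes the output directory contains one JSON document per file with a .json extension; this README does not state the on-disk layout, so treat that as an assumption. The helper name load_records is hypothetical and is not part of uoftscrapers. The demo writes two made-up records to a temporary directory so that it runs without network access.

```python
import json
import tempfile
from pathlib import Path


def load_records(directory):
    """Return the parsed contents of every *.json file in `directory`.

    Assumption: each file holds exactly one JSON document.
    """
    records = []
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            records.append(json.load(handle))
    return records


# Demo with made-up data instead of real scraper output.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.json").write_text(
        json.dumps({"id": "a", "name": "First"}), encoding="utf-8"
    )
    Path(tmp, "b.json").write_text(
        json.dumps({"id": "b", "name": "Second"}), encoding="utf-8"
    )
    records = load_records(tmp)

print([record["id"] for record in records])
```

To use it on real output, pass the same path that was given to scrape(), for example './some/path'.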
Library Reference
Courses
Class name
uoftscrapers.Courses
Scraper source
http://coursefinder.utoronto.ca
Output format
{
"id": String,
"code": String,
"name": String,
"description": String,
"division": String,
"department": String,
"prerequisites": String,
"exclusions": String,
"level": Number,
"campus": String,
"term": String,
"breadths": [Number],
"meeting_sections": [{
"code": String,
"instructors": [String],
"times": [{
"day": String,
"start": Number,
"end": Number,
"duration": Number,
"location": String
}],
"size": Number,
"enrolment": Number
}]
}
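As an illustration of working with this format, the sketch below lists the meeting sections of a course that still have free seats, using only the size and enrolment fields shown above. The function name sections_with_space and all sample values are hypothetical.

```python
def sections_with_space(course):
    """Return (section code, free seats) for every section that is not full."""
    open_sections = []
    for section in course.get("meeting_sections", []):
        free = section["size"] - section["enrolment"]
        if free > 0:
            open_sections.append((section["code"], free))
    return open_sections


# Made-up record that follows the Courses output format.
sample_course = {
    "id": "EXAMPLE-ID",
    "code": "EXAMPLE100",
    "meeting_sections": [
        {"code": "L0101", "instructors": [], "times": [], "size": 100, "enrolment": 100},
        {"code": "L0201", "instructors": [], "times": [], "size": 50, "enrolment": 42},
    ],
}

open_sections = sections_with_space(sample_course)
print(open_sections)
```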
Buildings
Class name
uoftscrapers.Buildings
Scraper source
Output format
{
"id": String,
"code": String,
"name": String,
"short_name": String,
"campus": String,
"address": {
"street": String,
"city": String,
"province": String,
"country": String,
"postal": String
},
"lat": Number,
"lng": Number,
"polygon": [
[Number, Number]
]
}
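The polygon field can be used to check whether a coordinate falls within a building outline. The sketch below uses the standard ray-casting test and is not taken from the library. It assumes the query point uses the same coordinate order as the polygon vertices; this README does not say whether pairs are [lat, lng] or [lng, lat]. The function name and the sample square are hypothetical.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if `point` lies inside `polygon`.

    `polygon` is a list of [a, b] vertex pairs and `point` is an (a, b)
    pair in the same coordinate order. Points exactly on an edge may be
    classified either way.
    """
    x, y = point
    inside = False
    count = len(polygon)
    for i in range(count):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % count]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Made-up square outline, not real building data.
square = [[0, 0], [0, 2], [2, 2], [2, 0]]
print(point_in_polygon((1, 1), square))
print(point_in_polygon((3, 1), square))
```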
Textbooks
Class name
uoftscrapers.Textbooks
Scraper source
Output format
{
"id": String,
"isbn": String,
"title": String,
"edition": Number,
"author": String,
"image": String,
"price": Number,
"url": String,
"courses":[{
"id": String,
"code": String,
"requirement": String,
"meeting_sections":[{
"code": String,
"instructors": [String]
}]
}]
}
Food
Class name
uoftscrapers.Food
Scraper source
Output format
{
"id": String,
"building_id": String,
"name": String,
"short_name": String,
"description": String,
"url": String,
"tags": [String],
"image": String,
"campus": String,
"lat": Number,
"lng": Number,
"address": String,
"hours": {
"sunday": {
"closed": Boolean,
"open": Number,
"close": Number
},
"monday": {
"closed": Boolean,
"open": Number,
"close": Number
},
"tuesday": {
"closed": Boolean,
"open": Number,
"close": Number
},
"wednesday": {
"closed": Boolean,
"open": Number,
"close": Number
},
"thursday": {
"closed": Boolean,
"open": Number,
"close": Number
},
"friday": {
"closed": Boolean,
"open": Number,
"close": Number
},
"saturday": {
"closed": Boolean,
"open": Number,
"close": Number
}
}
}
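The hours object can answer "is this place open right now?" with a small helper. The sketch below assumes that the queried time is expressed in the same unit as the open and close fields; this README does not specify that unit. The function name is_open and the sample values are hypothetical.

```python
def is_open(hours, day, time):
    """Return True if `time` falls within the opening hours for `day`.

    `day` is a lowercase weekday name such as "monday". `time` must use the
    same unit as the "open" and "close" fields (an assumption of this sketch).
    """
    entry = hours[day]
    if entry["closed"]:
        return False
    return entry["open"] <= time < entry["close"]


# Made-up hours in arbitrary units, following the Food output format.
sample_hours = {
    "sunday": {"closed": True, "open": 0, "close": 0},
    "monday": {"closed": False, "open": 9, "close": 17},
}

print(is_open(sample_hours, "monday", 12))
print(is_open(sample_hours, "sunday", 12))
```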
Calendar
Class name
uoftscrapers.Calendar
Scraper source
Output format
Not implemented.
UTSG Calendar
Class name
uoftscrapers.UTSGCalendar
Scraper source
http://www.artsandscience.utoronto.ca/ofr/calendar/
Output format
Refer to Calendar
UTM Calendar
Class name
uoftscrapers.UTMCalendar
Scraper source
https://student.utm.utoronto.ca/calendar/calendar.pl
Output format
Refer to Calendar
UTSC Calendar
Class name
uoftscrapers.UTSCCalendar
Scraper source
http://www.utsc.utoronto.ca/~registrar/calendars/calendar/index.html
Output format
Refer to Calendar
Timetable
Class name
uoftscrapers.Timetable
Scraper source
Output format
{
"id": String,
"code": String,
"name": String,
"description": String,
"division": String,
"department": String,
"prerequisites": String,
"exclusions": String,
"level": Number,
"campus": String,
"term": String,
"breadths": [Number],
"meeting_sections": [{
"code": String,
"instructors": [String],
"times": [{
"day": String,
"start": Number,
"end": Number,
"duration": Number,
"location": String
}],
"size": Number,
"enrolment": Number
}]
}
UTSG Timetable
Class name
uoftscrapers.UTSGTimetable
Scraper source
https://timetable.iit.artsci.utoronto.ca
Output format
Refer to Timetable
UTM Timetable
Class name
uoftscrapers.UTMTimetable
Scraper source
https://student.utm.utoronto.ca/timetable
Output format
Refer to Timetable
UTSC Timetable
Class name
uoftscrapers.UTSCTimetable
Scraper source
http://www.utsc.utoronto.ca/~registrar/scheduling/timetable
Output format
Refer to Timetable
Exams
Class name
uoftscrapers.Exams
Scraper source
Output format
{
"id": String,
"course_id": String,
"course_code": String,
"period": String,
"date": String,
"start_time": Number,
"end_time": Number,
"duration": Number,
"sections": [{
"lecture_code": String,
"exam_section": String,
"location": String
}]
}
UTSG Exams
Class name
uoftscrapers.UTSGExams
Scraper source
http://www.artsci.utoronto.ca/current/exams
Output format
Refer to Exams
UTM Exams
Class name
uoftscrapers.UTMExams
Scraper source
https://student.utm.utoronto.ca/examschedule/finalexams.php
Output format
Refer to Exams
UTSC Exams
Class name
uoftscrapers.UTSCExams
Scraper source
http://www.utsc.utoronto.ca/registrar/examination-schedule
Output format
Refer to Exams
Athletics
Class name
uoftscrapers.Athletics
Scraper source
Output format
{
"date": String,
"events":[{
"title": String,
"campus": String,
"location": String,
"building_id": String,
"start_time": Number,
"end_time": Number,
"duration": Number
}]
}
UTSG Athletics
Class name
uoftscrapers.UTSGAthletics
Scraper source
Not yet implemented
Output format
Refer to Athletics
UTM Athletics
Class name
uoftscrapers.UTMAthletics
Scraper source
http://www.utm.utoronto.ca/athletics/schedule/month/
Output format
Refer to Athletics
UTSC Athletics
Class name
uoftscrapers.UTSCAthletics
Scraper source
http://www.utsc.utoronto.ca/athletics/calendar-node-field-date-time/month/
Output format
Refer to Athletics
Parking
Class name
uoftscrapers.Parking
Scraper source
Output format
{
"id": String,
"title": String,
"building_id": String,
"campus": String,
"type": String,
"description": String,
"lat": Number,
"lng": Number,
"address": String
}
Shuttles
Class name
uoftscrapers.Shuttles
Scraper source
https://m.utm.utoronto.ca/shuttle.php
Output format
{
"date": String,
"routes": [{
"id": String,
"name": String,
"stops": [{
"location": String,
"building_id": String,
"times": [{
"time": Number,
"rush_hour": Boolean,
"no_overload": Boolean
}]
}]
}]
}
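A common use of this format is finding the next departure from a given stop. The sketch below assumes the query time uses the same unit as the time field, which this README does not specify. The function name next_departure, the stop names, and the times are hypothetical.

```python
def next_departure(route, location, after):
    """Return the earliest departure time at `location` that is >= `after`.

    Returns None when the route has no later departure from that stop.
    """
    candidates = [
        entry["time"]
        for stop in route["stops"]
        if stop["location"] == location
        for entry in stop["times"]
        if entry["time"] >= after
    ]
    return min(candidates) if candidates else None


# Made-up route following the Shuttles output format.
sample_route = {
    "id": "example-route",
    "name": "Example Route",
    "stops": [
        {
            "location": "Stop A",
            "building_id": "000",
            "times": [
                {"time": 100, "rush_hour": False, "no_overload": False},
                {"time": 300, "rush_hour": True, "no_overload": False},
            ],
        },
        {
            "location": "Stop B",
            "building_id": "001",
            "times": [{"time": 200, "rush_hour": False, "no_overload": True}],
        },
    ],
}

print(next_departure(sample_route, "Stop A", 150))
```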
Events
Class name
uoftscrapers.Events
Scraper source
https://www.events.utoronto.ca/
Output format
{
"id": String,
"title": String,
"start_date": String,
"end_date": String,
"start_time": Number,
"end_time": Number,
"duration": Number,
"url": String,
"description": String,
"admission_price": String,
"campus": String,
"location": String,
"audiences": [String]
}
Libraries
Class name
uoftscrapers.Libraries
Scraper source
https://onesearch.library.utoronto.ca/
Output format
{
"id": String,
"name": String,
"image": String,
"website": String,
"address": String,
"phone": String,
"about": String,
"collection_strengths": String,
"access": String,
"hours": {
"sunday": {
"closed": Boolean,
"open": String,
"close": String
},
"monday": {
"closed": Boolean,
"open": Number,
"close": Number
},
"tuesday": {
"closed": Boolean,
"open": Number,
"close": Number
},
"wednesday": {
"closed": Boolean,
"open": Number,
"close": Number
},
"thursday": {
"closed": Boolean,
"open": Number,
"close": Number
},
"friday": {
"closed": Boolean,
"open": Number,
"close": Number
},
"saturday": {
"closed": Boolean,
"open": Number,
"close": Number
}
}
}
Dates
Class name
uoftscrapers.Dates
Scraper source
Output format
{
"date": String,
"events": [{
"end_date": String,
"session": String,
"campus": String,
"description": String
}]
}
UTSG Dates
Class name
uoftscrapers.UTSGDates
Scraper source
http://www.artsci.utoronto.ca/current/course/timetable/
http://www.undergrad.engineering.utoronto.ca/About/Dates_Deadlines.htm
Output format
Refer to Dates
UTM Dates
Class name
uoftscrapers.UTMDates
Scraper source
http://m.utm.utoronto.ca/importantDates.php
Output format
Refer to Dates