SparkNotes chapter summaries compiler

Published on 7th April 2021 by Madeleine Smith

Background

I built this website a few years back after struggling through Dickens’s ‘A Tale of Two Cities’. I always find Dickens’s English pretty hard to understand and frequently have no idea what’s going on 😆. As a result, whilst I was reading this book, I would also go to SparkNotes (remember that site?) and read the summaries of each chapter.

After a while, I wanted a single document of all the chapter summaries on my Kindle so I could read a chapter and then easily flip to its summary. Initially, I was compiling this document by hand and then manually emailing it to my Kindle. However, it would take an age to copy/paste all the summaries into a Word doc, and I knew I could write software to do this job for me.

I built the front-end with Vue and hosted it on Amazon S3.

I decided to use Python for the back-end, as I knew it has a large selection of libraries for web scraping. I had also wanted to learn the language, as I hadn’t yet had the opportunity to use it in my professional career. I hosted the back-end on Heroku for simplicity.


Implementation

When the user hits submit in the application, a job is added to a queue (using the Redis Queue package), which starts the whole SparkNotes scraping process. I found it necessary to use a queue because the scraping process takes about 10 seconds on my machine, so I needed to do this work in the background so as not to block the request-response cycle.
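As a rough sketch of this pattern (not the site's actual code), a Redis Queue job can record its own progress in `job.meta` so that a later status check has something to report. The names `scrape_book`, `enqueue_scrape`, `progress_percent`, `fetch_summary`, and the Redis URL are all assumptions made for illustration. The `rq` and `redis` imports are done inside the functions so the helper can be read and tried without a Redis server.

```python
def progress_percent(done, total):
    """Return completed chapters as a whole percentage (0 if there is nothing to do)."""
    if total <= 0:
        return 0
    return min(100, done * 100 // total)


def fetch_summary(url):
    """Hypothetical scraper for one chapter page; the real parsing is not shown here."""
    raise NotImplementedError("placeholder for the actual SparkNotes scraping code")


def scrape_book(chapter_urls):
    """Job body, run by an RQ worker process rather than by the web request."""
    from rq import get_current_job  # only available meaningfully inside a worker

    job = get_current_job()
    summaries = []
    for index, url in enumerate(chapter_urls, start=1):
        summaries.append(fetch_summary(url))
        # Store progress on the job so the web process can read it later.
        job.meta["progress"] = progress_percent(index, len(chapter_urls))
        job.save_meta()
    return summaries


def enqueue_scrape(chapter_urls, redis_url="redis://localhost:6379"):
    """Queue the scrape and return immediately with the job id (assumed local Redis)."""
    from redis import Redis
    from rq import Queue

    queue = Queue(connection=Redis.from_url(redis_url))
    job = queue.enqueue(scrape_book, chapter_urls)
    return job.get_id()
```

The web request only calls `enqueue_scrape` and hands the job id back to the client, which is what keeps the request-response cycle short.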

The front-end then polls the back-end at one-second intervals to get the status of the job and updates the progress bar to reflect this.
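The real polling lives in the Vue front-end, but the loop itself is simple enough to sketch in Python for illustration. Here `fetch_status` stands in for the HTTP call to the back-end and `on_progress` for the progress-bar update. The response shape (`state` and `progress` keys) is an assumption, with `finished` and `failed` borrowed from Redis Queue's job status names.

```python
import time


def poll_job(fetch_status, on_progress, interval=1.0, max_polls=60, sleep=time.sleep):
    """Poll until the job finishes or fails, reporting progress after every poll.

    fetch_status: callable returning a dict such as {"state": "started", "progress": 40}
    on_progress:  callable that receives the latest progress percentage
    """
    for _ in range(max_polls):
        status = fetch_status()
        on_progress(status.get("progress", 0))
        if status["state"] in ("finished", "failed"):
            return status
        sleep(interval)  # wait one interval before asking again
    raise TimeoutError(f"job still running after {max_polls} polls")
```

Capping the number of polls means a stuck job eventually surfaces as an error instead of leaving the progress bar spinning forever.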

On successful completion of the job, the user is presented with two options: download the file or email it to a Kindle. For the download step, I’m converting the generated HTML document to a .docx file using Pandoc and then sending this file to the client.
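A minimal sketch of the Pandoc step, assuming the `pandoc` binary is on the PATH and the HTML has already been written to disk. The function names and file names are made up for illustration; `--from`, `--to`, and `--output` are standard Pandoc options.

```python
import subprocess
from pathlib import Path


def build_pandoc_command(html_path, docx_path):
    """Build the argument list for converting an HTML file to .docx with Pandoc."""
    return [
        "pandoc", str(html_path),
        "--from", "html",
        "--to", "docx",
        "--output", str(docx_path),
    ]


def convert_html_to_docx(html_path):
    """Run Pandoc and return the path of the generated .docx file."""
    docx_path = Path(html_path).with_suffix(".docx")
    # check=True raises CalledProcessError if Pandoc exits with an error.
    subprocess.run(build_pandoc_command(html_path, docx_path), check=True)
    return docx_path
```

Passing the command as a list rather than a shell string avoids quoting problems if a book title ends up in the file name.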

For the ‘email to Kindle’ step, I’m first converting the file to a MOBI file with KindleGen (which no longer seems to exist haha). Then I’m using Flask-Mail to email out this file from a test Gmail account. However, in the four years since I built this website, Google has turned off access for less secure apps, meaning that the original email functionality no longer works 😔. If I had more time (haha), I’d rework this email functionality to use a service like SendGrid instead.
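For illustration only, here is what building and sending such an email looks like with nothing but Python's standard library. This is a stand-in, not the Flask-Mail code the site used and not a SendGrid integration. The addresses, subject line, and SMTP settings are placeholders, and the provider's SMTP server would need to accept a username and password login.

```python
import smtplib
from email.message import EmailMessage


def build_kindle_email(sender, kindle_address, filename, file_bytes):
    """Create an email with the converted book attached as a binary file."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = kindle_address
    message["Subject"] = "Chapter summaries"  # placeholder subject
    message.set_content("Chapter summaries attached.")
    message.add_attachment(
        file_bytes,
        maintype="application",
        subtype="octet-stream",
        filename=filename,
    )
    return message


def send_email(message, host, port, username, password):
    """Send the message over SMTP with STARTTLS (server details are assumptions)."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(username, password)
        server.send_message(message)
```

Keeping message construction separate from sending makes it easy to swap the transport later without touching how the attachment is built.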

Conclusion

Overall, this was an interesting project to work on, and if I ever pick up some Dickens again, I know where to turn. Check it out here.


In need of a back-end engineer for your project? Get in touch to hire me for contract work 💯