I’ve started work on a parser for the static content of dylanchords.info. It’s still a bit rough, but I think it’s a decent start. I began by creating a schema based on what (I think) Eyolf was looking for, then parsed the source files with Perl and loaded the results into a database. I’ve also set up a quick-and-dirty CodeIgniter site to view the parsed data like DC1.0. It includes a nice little CRUD component for viewing and editing the raw/live data. Finally, I put it all up on GitHub, including the parsers, a database dump, and the CodeIgniter site. Here are the links:
- Schema: http://benpilkerton.org/dylanchords/_db/dc_schema.png
- CodeIgniter app: http://benpilkerton.org/dylanchords/index.php/dc/
- View/Edit parsed data: http://benpilkerton.org/dylanchords/index.php/crud
- Source: https://github.com/bpilkerton/dylanchords
Please take a look and let me know what you think. I’m sure more fields need to be added, and you may want to normalize the schema further. The data and the parser definitely need work; some songs parsed better than others. The original HTML was largely preserved and should probably be cleaned up. Perhaps the data should be massaged manually?
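As a starting point for the HTML cleanup, here is a minimal sketch of one possible approach. It is written in Python using only the standard library, not the Perl of the actual parsers, and the names `TextExtractor` and `strip_markup` as well as the sample input are hypothetical. It drops all tags, decodes entities, and keeps leading whitespace, on the assumption that chord lines depend on column alignment.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text content and drop all tags (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; in text data.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Treat <br> as a line break so chord and lyric lines stay separate.
        if tag == "br":
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def strip_markup(fragment: str) -> str:
    """Return the plain text of an HTML fragment, preserving indentation."""
    parser = TextExtractor()
    parser.feed(fragment)
    parser.close()
    lines = "".join(parser.parts).splitlines()
    # Trim trailing spaces only: leading spaces position chords over lyrics.
    return "\n".join(line.rstrip() for line in lines).strip("\n")


# Invented sample input, not taken from the database dump.
sample = "<pre>  C       G\n<b>How many roads</b> must a man walk down &amp; more<br/></pre>"
cleaned = strip_markup(sample)
print(cleaned)
```

Something like this could run over the stored song bodies as a first pass, leaving only the awkward cases to be fixed by hand through the CRUD view.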