pdf: move parsing into its own class, rewrite the parser