1. Handle lists in better way. [partly worked on, target latest by v2.0] 2. Heuristics based cleanup of damaged document content. [Looking for more test samples.] 3. Extract images. Now there has been a user request as well. [target pre v2.0] 4. Handle footnotes. 5. Improve table and short line justification handling. Ideally table columns in a single row should be separated by pipe. Short line justification needs to be adjusted to situations when tab occurs in line. A quick look into these issues suggests that logic/code will need to be reorganised to handle these. 6. Create a simple manpage, hopefully after resolving footnote and list issues. 7. Implement simple state-machine for speedup [partially worked towards it]. 8. XML parsing??? and making things more efficient. When it has matured enough, may be a C/C++ version should be looked into.