History :
Documentation has been a part of our culture ever since the
written word was invented. It is, as we all know, the simplest way
of enabling understanding and reference. The methods of documentation have, of
course, evolved over the years along with the formats in which the data
is stored. Data formats, likewise, have been around for as long as
computing. They have reflected the varying capabilities and functions of different
computing systems and have evolved as those systems evolved. Over
the decades, a wide range of formats (TXT, PDF, HTML and DOC, just to
name a few) became popular because they met specific user needs and tapped into
new computing capabilities as these emerged. Then came rising
expectations and demands, and technology met them by changing at a scorching
pace. Advances were being made in the field almost daily, to
the extent that obsolescence practically became a built-in attribute.
With such advances and the passage of time, those that
don’t keep pace fade away into the dark corners of technological obsolescence.
Many of us have seen older formats disappear. For instance, punch
cards were once commonplace, but you wouldn’t think of using them today.
WordStar was once the word processor everyone used; now, even filters
to read its format are less and less common. (Closer to home, think of Tally 4.5 to
Tally 9, Windows 3.11 to Vista, and so on.) Luckily, the WordStar format is
similar to ASCII and is thus mostly recoverable. But there are times when I
can’t read some important PowerPoint 4 files in today’s PowerPoint, only 7 years
later. It has come to the point where a file you created in a piece of software less than
half a decade ago is no longer usable, because the current application no
longer supports it.
Today
Then why should it be different for your documents? You
should be able to send your documents to your customers without knowing what
office software they run, and be confident that they will open. Have you ever had
trouble opening a document that someone sent you? Have you ever bought a copy
of an application that you didn’t want because you had to read
documents that only work with that version of the application? Have you
ever wondered why there is so little choice in office software?
This is where the OpenDocument Format (ODF), an open,
XML-based file format for office documents, comes into the picture.
OpenDocument files include text documents, spreadsheets, drawings, presentations and
more. The OpenDocument format is freely available for any software maker to use and
implement, and does not favour any one vendor over the others. The creation of
XML-based document formats continues the evolution of data formats, and even within this
category a number of formats are being developed, including ODF, Open XML and
UOF. We should expect the creation of new formats in the future as the
technology evolves, and, as has always been the case, users should be able to
choose the formats that work best for them.
Recent developments :
One objective of open formats like OpenDocument is to
guarantee long-term access to data without legal or technical barriers, and some
governments have come to view open formats as a public policy issue.
OpenDocument is intended to be an alternative to proprietary formats,
including the commonly used DOC, XLS, and PPT formats used by
Microsoft Office and other applications. Up until Feb. 15th 2008, these
latter formats did not have documentation available for download, and were only
obtainable by writing directly to Microsoft Corporation and signing a
restrictive non-disclosure agreement. As of Feb. 15th 2008, Microsoft
offers documentation for download that it claims accurately specifies the aforementioned
document formats (although this claim hasn’t been independently verified yet).
Microsoft is supporting the creation of a plug-in for Office to allow it to use
OpenDocument. The OpenDocument Foundation, Inc. has created a similar
plug-in that will allow continued use of Microsoft Office.
The OpenDocument format (ODF, ISO/IEC 26300; full name:
OASIS Open Document Format for Office Applications) is a free and open file
format for electronic office documents, such as spreadsheets, charts,
presentations and word processing documents. While the specifications were
originally developed by Sun, the standard was developed by the Open Office XML
technical committee of the Organization for the Advancement of Structured
Information Standards (OASIS) consortium and is based on the XML format originally
created and implemented by the OpenOffice.org office suite (see OpenOffice.org
XML).
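To make the "open, XML-based" point concrete, here is a minimal sketch in Python. It assumes the common packaged form of OpenDocument, in which a file is a ZIP archive holding a "mimetype" entry and a "content.xml" entry. The archive built below is deliberately incomplete (it has no manifest, styles or version attribute, so it is not a conforming ODF file), and the function names build_package and read_paragraphs are made up for this illustration. The point is only that the text can be read back with nothing but a standard library.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by OpenDocument content
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# A stripped-down content.xml with two paragraphs (illustrative only)
CONTENT_XML = (
    f'<office:document-content xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">'
    "<office:body><office:text>"
    "<text:p>Hello, OpenDocument.</text:p>"
    "<text:p>Readable with ordinary tools.</text:p>"
    "</office:text></office:body>"
    "</office:document-content>"
)


def build_package() -> bytes:
    """Build an in-memory ZIP shaped like a (very incomplete) text document."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The mimetype entry is written first and stored uncompressed
        zf.writestr(
            "mimetype",
            "application/vnd.oasis.opendocument.text",
            compress_type=zipfile.ZIP_STORED,
        )
        zf.writestr("content.xml", CONTENT_XML, compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


def read_paragraphs(package: bytes) -> list[str]:
    """Return the text of every text:p paragraph found in content.xml."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read("content.xml"))
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]


if __name__ == "__main__":
    for line in read_paragraphs(build_package()):
        print(line)
```

No office suite from any vendor is involved in reading the text back, which is the practical meaning of a format that any software maker is free to implement.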
Case for the Governments to adopt open document formats:
In all humility, and with whatever limited knowledge I have of technology and of the trends that are taking shape, I am now getting paranoid about the whole e-filing process and the initiatives adopted by the Government.
The process adopted by the Government has come in bits and pieces (fits and starts is more like it) and has been rather haphazard. Instead of learning from each other’s experience, every department has tried to do its “own thing”.
For instance, the e-filing process was kicked off by the Government in 2004. At the time text files were in vogue (they still are in the eTDS process); then came PDF (MCA 21 and ITRs for corporates in AY 06-07). Last year it was Excel and XML, and the story will go on.
This year the Government is pushing for e-filing not only for Income Tax but also for Service Tax, VAT and other laws. Even here, there is no uniformity. The Income-tax Department uses the XML format and the VAT authorities seem to be following suit, but the Excise & Service Tax authorities still depend on an HTML format (EASIEST), and the MCA relies on the PDF format.
The concern stems from the fact that governments don’t create office documents just so that they can be tossed in the shredder. These documents often have to be accessible decades (or centuries) later, and many of them have to be accessible to any citizen, regardless of what equipment they use or will use. That being so, the question that needs answering is whether the Government has given serious thought to the fact that, although PDF is a very useful display format, it serves a different purpose: it is great at preserving formatting, but it doesn’t let you edit the data meaningfully. HTML is great for web pages or short documents, but it is just not capable enough for data mining and data retrieval. Both HTML and PDF will continue to be used, but they cannot serve as a complete replacement for office document formats.
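The data-retrieval argument can be illustrated with a small sketch in Python. The return layout below is entirely hypothetical: the element names, attributes and figures are invented for illustration and do not reflect any schema actually used by a tax department. It only shows that, when data is kept as structured XML, a few lines using a standard parser are enough to pull out and total the figures, something a display-oriented format makes far harder.

```python
import xml.etree.ElementTree as ET

# Hypothetical return layout; names and amounts are invented for illustration
RETURN_XML = """
<Return assessmentYear="2007-08">
  <Taxpayer>Example Traders</Taxpayer>
  <Income head="Business">450000</Income>
  <Income head="Other sources">50000</Income>
</Return>
"""


def income_by_head(xml_text: str) -> dict[str, int]:
    """Map each income head to its amount, read directly from the XML."""
    root = ET.fromstring(xml_text)
    return {node.get("head"): int(node.text) for node in root.findall("Income")}


def total_income(xml_text: str) -> int:
    """Add up all Income entries in the return."""
    return sum(income_by_head(xml_text).values())


if __name__ == "__main__":
    print(income_by_head(RETURN_XML))
    print(total_income(RETURN_XML))
```

Because the structure is explicit, the same data can be read by any tool on any platform, years later, without the software that originally produced it.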
The writing on the wall suggests that the taxpayer, along with dealing with the many intricacies of the law, will now be saddled with the additional burden of dealing with multiple data formats. Nobody knows what will happen 5-7 years down the line when presumably better formats are in vogue. Unless the Government realises the pitfalls and makes a conscientious effort to develop or adopt standardised, open-standard software, we will all have to save our old software packages and the files generated through them on floppies, CDs, DVDs, etc., and pray that they still work when the sleeping giant wakes up.