June 1st, 2005

Office 12 defaulting to .XML file format

Posted by Dan Farber @ 11:56 pm

Categories: General, IT Management, Software Infrastructure

Chris Capossela, who runs product management for the Office family of products, dropped by to see me to dribble out more details about the next version of Microsoft Office (currently dubbed "12"), which is due in the second half of 2006.

The important revelation, which was expected, is that some Office 12 applications (Word, Excel and PowerPoint) will use Office Open XML as the default file format. Note: Excel and Word already have XML support and related schemas for saving documents with full fidelity as XML files. The formats are industry-standard XML 1.0 and the schemas are available on a royalty-free basis. As a result, developers can query what’s in a file and extract specific data, or write their own compatible applications to view and manipulate the files. Users can open the .XML files in any application that can read XML. "Our value is not tied to file format, but to the user experience and quality of the software," Capossela said. Now that’s a refreshing point of view, given how often Microsoft has made it difficult in the past for others to parse its file formats.

What’s new for Microsoft is compacting the often overweight XML text files with industry-standard Zip compression. The data within a document (including comments, charts and document metadata) is segmented, stored in separate components, and compressed and decompressed with Zip. However, OLE objects and images are still stored as binaries.
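To make the idea of a zipped, multi-part XML document concrete, here is a minimal sketch in Python. It is not the Office 12 format: the part names (document.xml, comments.xml, metadata.xml) and the XML inside them are invented for illustration, and the only assumption carried over from the description above is that each kind of data lives in its own XML component inside a Zip archive.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Hypothetical part names and XML content, invented for illustration only.
PARTS = {
    "document.xml": "<document><p>Quarterly report</p><p>Sales rose.</p></document>",
    "comments.xml": "<comments><comment author='dan'>Check figures</comment></comments>",
    "metadata.xml": "<metadata><title>Report</title></metadata>",
}


def build_package(parts):
    """Write each XML part as its own Deflate-compressed Zip entry."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, xml_text in parts.items():
            archive.writestr(name, xml_text)
    return buffer.getvalue()


def read_paragraphs(package_bytes, part_name="document.xml"):
    """Unzip a single part and pull the text of its <p> elements."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        root = ET.fromstring(archive.read(part_name))
    return [p.text for p in root.iter("p")]


package = build_package(PARTS)
paragraphs = read_paragraphs(package)
print(paragraphs)
```

Any tool that can open a Zip archive and parse XML could do the same, which is the interoperability argument in a nutshell.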

Using Zip gets around the thorny issue of creating a binary XML to deal with file bloat. A few months ago I had a conversation with Jean Paoli, co-creator of the XML standard and senior director of XML architecture at Microsoft, who told me that binary XML is "nonsense." From his viewpoint, it’s not possible to create a one-size-fits-all binary XML standard to solve all the performance and size issues. "I am not negating the problems, but it’s not a matter of creating a binary," Paoli said. At that time he mentioned existing technology, such as XML-binary Optimized Packaging (XOP) from the W3C, or using Zip. "Everybody has Zip, and XML Zips very well. For many scenarios it’s good enough," Paoli said. He also projected that by 2010, 75 percent of documents would be stored in XML format.
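Paoli’s point that XML compresses well is easy to check on a small scale. The sketch below is only an illustration: the spreadsheet-like markup is made up, and it uses Python’s zlib module, which implements the same Deflate algorithm that Zip archives commonly use. Real documents will compress more or less depending on their content.

```python
import zlib

# Made-up, repetitive spreadsheet-style XML; real documents will vary.
rows = "".join(
    f"<row><id>{i}</id><name>item{i}</name><price>9.99</price></row>"
    for i in range(1000)
)
xml_bytes = f"<sheet>{rows}</sheet>".encode("utf-8")

# Deflate is the compression method Zip archives typically use.
compressed = zlib.compress(xml_bytes)
ratio = len(compressed) / len(xml_bytes)

print(f"original:   {len(xml_bytes)} bytes")
print(f"compressed: {len(compressed)} bytes ({ratio:.0%} of original)")
```

The repeated tag names are what make the text so compressible: markup overhead is exactly the kind of redundancy Deflate removes.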

Using XML and Zip is not a unique approach, however, given that open-source Office competitor OpenOffice (sponsored by Sun) already uses an XML-based file format and Zip compression to store files. The OpenOffice XML file format specification is maintained by an OASIS technical committee. According to a Microsoft spokesperson, OpenOffice.org has royalty-free access to the specs for the Office Open XML formats to ensure file compatibility. The current XML filter tool in OpenOffice supports the Microsoft Office 2003 XML file formats, although not always with full fidelity.
 

According to Capossela, users won’t notice any difference from the compressing and decompressing of files, and file size will be reduced 50 to 75 percent, resulting in savings on bandwidth and network storage. The file formats will be backwards compatible with Office 2000, and Microsoft will have tools to bulk convert files. None of the preexisting file formats are going away either.

One of the benefits of .XML, besides enabling more fluid interoperability with data and applications outside of Office, is that the XML-based file format improves data recovery from corrupted files, because it saves different types of data in discrete components. Instead of the entire file being corrupted, only a part of it would be damaged. The XML formats will also help prevent executable payloads, such as viruses, from being delivered inappropriately in files.
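The recovery argument can be sketched with a generic Zip container. This is an assumption-laden illustration, not Microsoft’s recovery logic: the part names and contents are hypothetical, and the damage is simulated by flipping one byte inside a single entry’s compressed data. The point is only that the other entries remain readable.

```python
import io
import struct
import zipfile
import zlib

# Hypothetical parts, invented for illustration only.
PARTS = {
    "document.xml": "<document><p>Quarterly report for the sales team</p></document>",
    "comments.xml": "<comments><comment>Please double-check the figures</comment></comments>",
    "metadata.xml": "<metadata><title>Quarterly report</title></metadata>",
}


def build_package(parts):
    """Write each XML part as its own Deflate-compressed Zip entry."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, xml_text in parts.items():
            archive.writestr(name, xml_text)
    return buffer.getvalue()


def corrupt_part(package_bytes, part_name):
    """Simulate damage: flip one byte in the middle of one entry's data."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        info = archive.getinfo(part_name)
    # A Zip local file header has 30 fixed bytes; the file name and extra
    # field lengths are stored as two little-endian shorts at offset 26.
    name_len, extra_len = struct.unpack_from(
        "<HH", package_bytes, info.header_offset + 26
    )
    data_start = info.header_offset + 30 + name_len + extra_len
    damaged = bytearray(package_bytes)
    damaged[data_start + info.compress_size // 2] ^= 0xFF
    return bytes(damaged)


def recover_parts(package_bytes):
    """Read every part that still passes decompression and CRC checks."""
    recovered, lost = {}, []
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        for name in archive.namelist():
            try:
                recovered[name] = archive.read(name).decode("utf-8")
            except (zipfile.BadZipFile, zlib.error):
                lost.append(name)
    return recovered, lost


damaged = corrupt_part(build_package(PARTS), "comments.xml")
recovered, lost = recover_parts(damaged)
print("recovered:", sorted(recovered))
print("lost:", lost)
```

In a single monolithic binary file the same flipped byte could make the whole document unreadable; here the damage is confined to the one component it landed in.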

A preview of Office 12 (not an initial beta, which isn’t due until the fall) will be available at www.microsoft.com/office/preview on Monday, June 6. I asked about XML file formats for Macintosh Office, but Capossela wasn’t sure; Mac Office is done by a different business group at Microsoft. Nor is a Linux version of Office on the drawing board. We’ll also have to wait to hear about other features that will make it into Office 12. The dribbling continues…

Dan Farber, editor-in-chief of CNET News.com, has more than 20 years of experience as an editor and journalist covering technology. See his full profile and disclosure of his industry affiliations.
