Cut and Paste from Microsoft Word


Pasting from Microsoft Word Is Like Putting Gum in Your Hair

(this content "inspired" by a well-written article at http://realestatetomato.typepad.com/)

 
Every day we come across 'broken documents'. The reason the document 'broke' is always the same: Microsoft Word was used to compose an article or document, and now the formatting is badly corrupted.
 
The most common issues involve font styles, sizes, bolding, italics and underlining. Others include indenting, bulleting, numbering and total sidebar destruction.
 
Here's why there is a problem:

Microsoft Word is a word processing tool, intended for standard, offline documents. While packages like Microsoft FrontPage are used to produce clean HTML, Microsoft Word does not do this. Microsoft is unlikely to fix this problem, because the same features that make Word so unfriendly to HTML editors make it an excellent word processor.
Microsoft Word is best used for items intended for printing or sharing offline: essays, business cards, menus, etc.
The proprietary code that Microsoft has written for its word processing program IS NOT HTML.

Below is an example of what their code can look like versus the more 'raw' version if it were written in HTML.
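
As a rough illustration of that contrast, here is a small Python sketch that holds one sentence marked up two ways. The Word-style snippet is hand-written to resemble the kind of markup Word tends to produce (the `MsoNormal` class, `mso-` style properties and `<o:p>` tags); it is an assumption for illustration, not output captured from a particular version of Word. The `summarize` helper is likewise hypothetical.

```python
# Illustrative only: hand-written markup in the style Word tends to produce,
# not output captured from a specific version of Word.
WORD_STYLE_MARKUP = (
    '<p class="MsoNormal" style="margin-bottom:0in;line-height:normal">'
    '<b style="mso-bidi-font-weight:normal">'
    '<span style="font-size:12.0pt;font-family:&quot;Times New Roman&quot;;'
    'mso-fareast-font-family:Calibri">Club meeting on Friday'
    '<o:p></o:p></span></b></p>'
)

# The same sentence as plain HTML; the site's style sheet decides the font.
CLEAN_MARKUP = "<p><strong>Club meeting on Friday</strong></p>"


def summarize(markup: str) -> dict:
    """Report the size of a snippet and whether it hard-codes formatting."""
    return {
        "characters": len(markup),
        "has_inline_style": "style=" in markup,
        "has_word_specific_code": "mso-" in markup or "<o:p>" in markup,
    }


if __name__ == "__main__":
    print("Word-style:", summarize(WORD_STYLE_MARKUP))
    print("Clean HTML:", summarize(CLEAN_MARKUP))
```

Both snippets display the same words. The Word-style version is several times longer and fixes its own font and spacing, which is what gets in the way of the site's style sheet.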
 
Raw HTML works efficiently with NeatClubs.COM because it can use CSS (Cascading Style Sheets) that 'govern' the look and feel of the text formatting and styling.
 
The font style, weight and size are all predefined in the style sheet. This prevents inconsistencies and gives a uniform look to the documents and blog articles published to the site. You do not have to worry about choosing a font style, color, or even size; it is done automatically.
 
MS Word does not mix well with the CSS structure and can actually override it, creating havoc throughout the rest of your site, not just in the article you posted. A common example: someone posts an article composed in MS Word, and every other article on the site suddenly appears in bold, italic or underline. Not good.

If you absolutely "must" write your content in Microsoft Word, here are some tips and tricks that we've found to be effective:

1. Save your content as text first

The best way to ensure clean content is to copy and paste it into a Windows-based program like "WordPad", and then copy and paste from WordPad into the NeatClubs.COM editor. This intermediate step ensures that the proprietary Microsoft Word tags are eliminated from your document and that you end up with relatively "clean" HTML. You will need to do some reformatting in the editor, but in our experience it is almost always worth the extra effort. When reformatting content, try to use styles as much as you can and avoid hardcoding particular fonts and colors into your document. Hardcoded formatting makes it difficult to change the color scheme, default fonts or font sizes later.

2. Save your MS-Word content in PDF format and upload the document to NeatClubs.COM

Although not a perfect solution, another approach is to print your Microsoft Word document to a PDF file and then upload the PDF file to NeatClubs.COM. PDF (Portable Document Format), as its name implies, is ideally suited to rendering documents consistently across a variety of browsers and operating systems. You can learn about options for creating PDF documents here.

3. Use the built-in "Paste from Word" feature in the editor

As a last resort, the editor component we have built into NeatClubs.COM has a "Paste from Word" function. To use it, select and copy your content from Word to the Windows clipboard, then select the "Paste from Word" option and paste the content into the window that appears. The editor will attempt to remove the proprietary markup from the Word document, but in our experience it does a less than perfect job.
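
Tips 1 and 3 both come down to removing Word's markup before the content reaches the site. The Python sketch below shows the two ideas side by side, using only the standard library: `to_plain_text` throws away every tag (roughly what the plain-text detour in tip 1 achieves), while `strip_word_markup` keeps basic tags and removes only the Word-specific parts (the general idea behind a "Paste from Word" filter). Both function names and the patterns they use are assumptions made for illustration. This is not the code the NeatClubs.COM editor runs, and a regex-based filter like this will miss cases, much as the built-in feature does.

```python
import re
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collect visible text and drop every tag and attribute."""

    BLOCK_TAGS = {"p", "div", "br", "li", "h1", "h2", "h3"}
    SKIP_TAGS = {"style", "script"}  # their contents are not visible text

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip += 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")  # start block-level content on a new line

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def to_plain_text(markup: str) -> str:
    """Tip 1, sketched: reduce pasted markup to plain text lines."""
    parser = _TextOnly()
    parser.feed(markup)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)


def strip_word_markup(markup: str) -> str:
    """Tip 3, sketched: keep basic tags, remove Word-specific markup."""
    flags = re.IGNORECASE | re.DOTALL
    # Conditional comments that wrap Office-only XML.
    markup = re.sub(r"<!--\[if.*?<!\[endif\]-->", "", markup, flags=flags)
    # Namespaced Office tags such as <o:p> and </o:p>.
    markup = re.sub(r"</?[a-z]+:[^>]*>", "", markup, flags=flags)
    # class="MsoNormal" and similar, then every inline style attribute.
    markup = re.sub(r'\s+class="?Mso[^"\s>]*"?', "", markup, flags=flags)
    markup = re.sub(r'\s+style="[^"]*"', "", markup, flags=flags)
    markup = re.sub(r"\s+style='[^']*'", "", markup, flags=flags)
    # Spans carry only presentation here, so drop the tags and keep their text.
    markup = re.sub(r"</?span\b[^>]*>", "", markup, flags=flags)
    return markup.strip()


if __name__ == "__main__":
    # Hand-written sample in the style of Word output (an assumption).
    pasted = (
        '<p class="MsoNormal" style="margin-bottom:0in">'
        '<b style="mso-bidi-font-weight:normal">'
        '<span style="font-size:12.0pt">Club meeting on Friday'
        '<o:p></o:p></span></b></p>'
    )
    print(to_plain_text(pasted))
    print(strip_word_markup(pasted))
```

The plain-text route loses all formatting, which is why tip 1 involves some reformatting afterwards. The filtering route keeps bold and paragraph tags but depends on recognizing every Word-specific pattern, so its results should still be checked in the editor.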