Document Formats

PDF

PDF - Portable Document Format

Universal document format for sharing and printing

Overview

PDF (Portable Document Format) is the most widely used format for sharing finished documents. Created by Adobe and now maintained as an open ISO standard (ISO 32000), it preserves formatting, fonts, images, and layout across platforms and devices.

Best Used For

Official documents, contracts, resumes, ebooks, forms, reports, and any document where preserving exact formatting is crucial.

Advantages
  • Universal compatibility
  • Preserves formatting perfectly
  • Supports digital signatures
  • Can be password protected
  • Supports forms and interactivity
Limitations
  • Difficult to edit without special software
  • Can create large file sizes
  • Text extraction may lose formatting
  • Not ideal for collaborative editing
DOCX

DOCX - Microsoft Word Document

Modern word processing format with rich features

Overview

DOCX has been Microsoft Word's default format since Word 2007. It is a ZIP archive of XML files (part of the Office Open XML standard) that supports rich text formatting, images, tables, and advanced document features.

Best Used For

Business documents, reports, letters, manuscripts, collaborative documents, and any content requiring rich formatting and editing capabilities.

Advantages
  • Rich formatting options
  • Excellent for collaborative editing
  • Supports track changes
  • Smaller file size than DOC
  • Compatible with many apps
Limitations
  • Requires Word or compatible software
  • Formatting may vary between apps
  • Not ideal for final distribution
  • Can contain hidden metadata
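
Because a DOCX file is a ZIP archive of XML parts, its text can be read with nothing but a standard library. The sketch below is illustrative only: `build_minimal_docx_like_zip` is a hypothetical helper that writes just `word/document.xml` (a real DOCX contains more parts, such as content types and relationships), and `extract_paragraphs` assumes the main text lives in that part.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# WordprocessingML namespace used inside word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def build_minimal_docx_like_zip(text: str) -> bytes:
    """Build an in-memory ZIP holding only word/document.xml.

    Hypothetical helper for this example; Word needs additional parts
    before it will open the result as a real document.
    """
    document_xml = (
        f'<w:document xmlns:w="{W_NS}"><w:body><w:p><w:r>'
        f"<w:t>{escape(text)}</w:t>"
        "</w:r></w:p></w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", document_xml)
    return buffer.getvalue()


def extract_paragraphs(docx_bytes: bytes) -> list[str]:
    """Return the text of each paragraph found in word/document.xml."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        # A paragraph is split into runs; join the text of every run.
        runs = [t.text or "" for t in paragraph.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return paragraphs
```

The same `extract_paragraphs` function should also work on the bytes of a file saved by Word, although tables, headers, and footnotes are stored in ways this sketch does not handle.
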
RTF

RTF - Rich Text Format

Cross-platform format for formatted text

Overview

RTF (Rich Text Format) is a proprietary document format developed by Microsoft for cross-platform document interchange. It supports basic formatting while remaining readable by most word processors.

Best Used For

Sharing documents between different word processors, basic formatted documents, and situations where compatibility is more important than advanced features.

Advantages
  • Wide compatibility
  • Preserves basic formatting
  • Human-readable format
  • Smaller than DOCX for simple docs
Limitations
  • Limited formatting options
  • Large file sizes for complex docs
  • No advanced features
  • Outdated for modern needs
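
RTF is plain text in which control words begin with a backslash and groups are wrapped in braces, which is why it is readable in a text editor. As a minimal sketch, the hypothetical `make_rtf` helper below wraps text in the smallest useful RTF envelope and escapes the three characters RTF treats specially. It assumes ASCII input; non-ASCII characters would need additional escaping that is not shown here.

```python
def make_rtf(text: str) -> str:
    """Wrap ASCII text in a minimal RTF document.

    Backslashes and braces are escaped because RTF uses them for
    control words and groups. The backslash must be replaced first
    so the escapes added for braces are not escaped again.
    """
    escaped = (
        text.replace("\\", "\\\\")
        .replace("{", "\\{")
        .replace("}", "\\}")
    )
    # \rtf1 marks RTF version 1; \ansi selects the ANSI character set.
    return "{\\rtf1\\ansi " + escaped + "}"
```
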

Text Formats

TXT

TXT - Plain Text

Universal format for unformatted text

Overview

Plain text files contain only character data without any formatting. They're the most basic and universally compatible file format, readable by virtually any application.

Best Used For

README files, notes, configuration files, logs, source code, and any content where formatting is unnecessary or undesirable.

Advantages
  • Universal compatibility
  • Tiny file sizes
  • Easy to process programmatically
  • Version control friendly
  • No hidden content
Limitations
  • No formatting options
  • No images or media
  • No tables or structure
  • Character encoding is not stored in the file and may be misdetected
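
A plain text file does not record which character encoding it uses, so the reader has to know or guess it. The sketch below shows one common approach under stated assumptions: try a list of candidate encodings in order and keep the first that decodes cleanly. The function name and the default candidate list are illustrative choices, not a standard.

```python
def decode_text(data: bytes, encodings=("utf-8", "latin-1")) -> tuple[str, str]:
    """Try each candidate encoding in order; return (text, encoding_used).

    latin-1 maps every byte to a character, so it never fails and
    works as a last resort, though the result may be wrong.
    """
    for encoding in encodings:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")
```

Decoding with the wrong encoding does not always raise an error. The UTF-8 bytes for "café", read as Latin-1, silently become "cafÃ©", which is why UTF-8 is tried first.
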
MD

MD - Markdown

Lightweight markup language for formatted text

Overview

Markdown is a lightweight markup language that uses plain text formatting syntax. It's designed to be readable as-is while easily converting to HTML and other formats.

Best Used For

Documentation, README files, blog posts, notes, wikis, and any content that needs simple formatting while remaining readable in plain text.

Advantages
  • Human-readable source
  • Version control friendly
  • Converts to many formats
  • Simple syntax
  • Widely supported
Limitations
  • Limited formatting options
  • No single standard (CommonMark and other dialects differ)
  • Tables can be cumbersome
  • Preview requires renderer
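
To illustrate how directly Markdown maps to HTML, here is a toy converter for a very small subset: single-line `#` headings, `**bold**`, `*italic*`, and paragraphs separated by blank lines. It is a sketch for illustration and does not follow any Markdown specification; real projects would use a full renderer.

```python
import html
import re


def markdown_to_html(source: str) -> str:
    """Convert a tiny Markdown subset to HTML (toy example)."""
    blocks = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", source.strip()):
        # Escape &, <, > so literal text cannot inject markup.
        text = html.escape(block.strip(), quote=False)
        # Handle bold before italic so ** is not consumed as two single *.
        text = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)
        text = re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)
        # Only a single-line block starting with 1 to 6 '#' is a heading.
        heading = re.fullmatch(r"(#{1,6})\s+(.*)", text)
        if heading:
            level = len(heading.group(1))
            blocks.append(f"<h{level}>{heading.group(2)}</h{level}>")
        else:
            blocks.append(f"<p>{text}</p>")
    return "\n".join(blocks)
```
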

Data Formats

CSV

CSV - Comma-Separated Values

Simple format for tabular data

Overview

CSV files store tabular data in plain text, with each line representing a row and commas separating values. It's one of the most common formats for data exchange between applications.

Best Used For

Spreadsheet data, database exports, data analysis, importing/exporting between applications, and any structured data that fits in rows and columns.

Advantages
  • Universal data exchange format
  • Human-readable
  • Small file sizes
  • Easy to generate and parse
  • Works with all spreadsheet apps
Limitations
  • No data types or formatting
  • Values containing commas, quotes, or line breaks must be quoted
  • No standard way to declare the character encoding
  • Limited to flat structure
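
Two of the points above, quoting and the lack of data types, are easy to see with Python's standard `csv` module. In this sketch the helper names are illustrative. The writer quotes any value that contains a comma or a quote, and the reader returns every value as a string.

```python
import csv
import io


def rows_to_csv(rows) -> str:
    """Serialize rows to CSV text using the csv module's default dialect."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


def csv_to_rows(text: str) -> list[list[str]]:
    """Parse CSV text back into rows; every value comes back as a string."""
    return list(csv.reader(io.StringIO(text, newline="")))
```

Writing the number `1` and reading it back yields the string `"1"`, so any type conversion is the job of the consuming application.
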
JSON

JSON - JavaScript Object Notation

Lightweight data interchange format

Overview

JSON is a text-based data format that's easy for humans to read and write, and easy for machines to parse and generate. It supports complex nested structures.

Best Used For

API responses, configuration files, data storage, web applications, and any scenario requiring structured data with nested objects and arrays.

Advantages
  • Human-readable
  • Supports complex structures
  • Native JavaScript support
  • Widely supported
  • Self-documenting
Limitations
  • No comments allowed
  • Verbose for simple data
  • No built-in schema validation (JSON Schema is a separate specification)
  • Limited data types (no date or binary types)
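
The following sketch shows nested data surviving a round trip through JSON, along with one consequence of its small set of data types. The record and the `to_json` helper are made up for illustration; passing `default=str` is a simple, lossy fallback for values JSON cannot represent, such as dates.

```python
import json
from datetime import date


def to_json(record: dict) -> str:
    """Serialize a record; values JSON cannot represent are stored as strings."""
    return json.dumps(record, indent=2, default=str)


def round_trip(record: dict) -> dict:
    """Serialize and parse again to show what survives the trip."""
    return json.loads(to_json(record))


EXAMPLE = {
    "user": {"name": "Ada", "tags": ["admin", "dev"]},
    "joined": date(2024, 1, 15),  # not a native JSON type
}
```

After `round_trip(EXAMPLE)`, the nested object and array are intact, but `joined` is a plain string and must be parsed back into a date by the reader.
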
XML

XML - Extensible Markup Language

Structured format for complex data

Overview

XML is a markup language that defines rules for encoding documents in a format that's both human-readable and machine-readable. It's highly structured and extensible.

Best Used For

Configuration files, data exchange between systems, document storage, web services (SOAP), and scenarios requiring strict structure and validation.

Advantages
  • Self-documenting
  • Supports validation (XSD)
  • Namespace support
  • Industry standard
  • Extensible
Limitations
  • Verbose syntax
  • Large file sizes
  • Complex to parse
  • Often slower to parse than JSON
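
As a small illustration of XML's element and attribute structure, the sketch below parses a made-up document with Python's standard `xml.etree.ElementTree` module. The sample data and function name are hypothetical. Note that ElementTree only checks that the document is well-formed; it does not validate against an XSD schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document used only for this example.
SAMPLE = """<library>
  <book id="1"><title>Dune</title><year>1965</year></book>
  <book id="2"><title>Neuromancer</title><year>1984</year></book>
</library>"""


def titles_by_id(xml_text: str) -> dict:
    """Map each book's id attribute to the text of its <title> child."""
    root = ET.fromstring(xml_text)
    return {book.get("id"): book.findtext("title") for book in root.iter("book")}
```
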

Code & Script Formats

JS

JavaScript and TypeScript Files (.js, .ts)

Script files for web and server applications

Overview

JavaScript and TypeScript files contain code for web applications, Node.js servers, and various other platforms. They're plain text files with programming logic.

Best Used For

Web development, server-side applications, build scripts, automation, and any JavaScript-based project. Combine related scripts for documentation or archival purposes.

PY

Python Files (.py)

Python programming language scripts

Overview

Python files contain code written in the Python programming language. They're used for data analysis, web development, automation, and scientific computing.

Best Used For

Data science projects, automation scripts, web applications, and combining multiple Python modules for documentation or distribution.

SQL

SQL Files (.sql)

Database query and schema files

Overview

SQL files contain database queries, schema definitions, and data manipulation commands. They're essential for database management and migration.

Best Used For

Database backups, migration scripts, combining multiple queries, and creating comprehensive database documentation or deployment packages.
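
Combining several SQL scripts into one deployment step can be sketched with Python's built-in `sqlite3` module. The table, the seed data, and the `run_scripts` helper are made up for this example, and SQLite stands in for whatever database a real project uses.

```python
import sqlite3

# Hypothetical schema and seed scripts, as they might appear in two .sql files.
SCHEMA_SQL = "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);"
SEED_SQL = """
INSERT INTO users (name) VALUES ('Ada');
INSERT INTO users (name) VALUES ('Linus');
"""


def run_scripts(scripts) -> sqlite3.Connection:
    """Run the given SQL scripts in order against a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    # executescript accepts multiple semicolon-separated statements.
    conn.executescript("\n".join(scripts))
    return conn
```

Order matters when scripts are combined: the schema script has to run before any script that inserts into its tables.
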

Web Formats

HTML

HTML - HyperText Markup Language

Standard format for web pages

Overview

HTML is the standard markup language for creating web pages. It describes the structure and content of web documents using tags and attributes.

Best Used For

Web pages, email templates, documentation with formatting, and any content intended for web browsers. Ideal for preserving links and multimedia references.

Advantages
  • Universal web standard
  • Supports multimedia
  • Interactive elements
  • Extensive styling options
  • SEO-friendly
Limitations
  • Requires browser to view properly
  • Not ideal for printing
  • Can be complex for simple docs
  • Embedded scripts can pose security risks (e.g. cross-site scripting)
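
Since HTML describes structure with tags and attributes, link targets can be pulled out by walking the tags. The sketch below uses Python's standard `html.parser` module; the class and function names are illustrative.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag encountered."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html_text: str) -> list:
    """Return all link targets found in the given HTML text."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links
```
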
CSS

CSS - Cascading Style Sheets

Styling language for web documents

Overview

CSS files contain styling rules for HTML documents. They control the visual presentation including colors, layouts, fonts, and animations.

Best Used For

Combining multiple stylesheets, creating style libraries, documentation of design systems, and consolidating CSS for optimization.
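
Consolidating stylesheets can be as simple as concatenating them with a comment banner that records where each part came from. This is a minimal sketch with a hypothetical helper name. It assumes none of the sheets rely on `@import` or `@charset`, which must appear at the top of a stylesheet and would break under naive concatenation.

```python
def combine_stylesheets(sheets: dict) -> str:
    """Join stylesheets in insertion order, each preceded by a banner comment.

    `sheets` maps a file name to that file's CSS text. Order matters,
    because later rules override earlier ones of equal specificity.
    """
    parts = []
    for name, css in sheets.items():
        parts.append(f"/* ---- {name} ---- */\n{css.strip()}\n")
    return "\n".join(parts)
```
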

Format Comparison Guide

Format   File Size   Formatting   Compatibility   Best For
TXT      Smallest    None         Universal       Simple notes, logs
PDF      Large       Full         Excellent       Final documents
DOCX     Medium      Full         Good            Editable docs
HTML     Small       Full         Web only        Web content
CSV      Small       None         Universal       Data exchange
MD       Small       Basic        Good            Documentation

Quick Decision Guide: Use TXT for maximum compatibility, PDF for sharing final documents, DOCX for collaborative editing, HTML for web content, CSV for data, and Markdown for technical documentation.
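
The quick decision guide can also be read as a simple lookup table. In the sketch below, the use-case labels and the `recommend_format` helper are hypothetical names chosen for this example; only the format recommendations themselves come from the guide.

```python
# Use-case labels are illustrative; the recommendations mirror the guide.
RECOMMENDED_FORMAT = {
    "maximum compatibility": "TXT",
    "final document": "PDF",
    "collaborative editing": "DOCX",
    "web content": "HTML",
    "tabular data": "CSV",
    "technical documentation": "MD",
}


def recommend_format(use_case: str):
    """Return the suggested format for a use case, or None if it is unknown."""
    return RECOMMENDED_FORMAT.get(use_case.strip().lower())
```
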