Text editor


A text editor, such as Notepad, is interactive software that allows a user to edit plain text.
As with any software, a text editor can be installed onto a system, but a simple text editor is often included in the default installation of an operating system, since editing text files is a basic need on any system and a simple text editor can be provided at low cost.
As source code is text, any text editor can be used to edit code, but a source-code editor is designed with features specifically intended for editing code. Some source-code editors integrate with software development tools and provide a debugging environment.

Plain text and rich text

There are important differences between plain text and rich text.
Plain text consists exclusively of character representations. Each character is represented by a fixed-length sequence of one, two, or four bytes, or by a variable-length sequence of one to four bytes, in accordance with a specific character encoding convention, such as ASCII, ISO/IEC 2022, Shift JIS, UTF-8, or UTF-16. These conventions define many printable characters, but also non-printing characters that control the flow of the text, such as space, line break, and page break. Plain text contains no other information about the text itself, not even the character encoding convention employed. Plain text is stored in text files, although text files do not exclusively store plain text. Since the early days of computers, plain text has generally been displayed using a monospace font, so that horizontal alignment and columnar formatting were sometimes done using whitespace characters.
Rich text, on the other hand, may contain metadata, character formatting data, paragraph formatting data, and page specification data. Rich text can be very complex. It can be saved in a binary format, in text files adhering to a markup language, or in a hybrid of both.
Text editors are intended to open and save text files containing either plain text or anything that can be interpreted as plain text, including the markup for rich text or the markup for something else.
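The encoding points above can be illustrated with a short Python sketch. It is only an illustration: the sample characters and the helper name encoded_lengths are chosen here for the example and do not come from any particular editor. It shows that UTF-32 uses a fixed four bytes per character, that UTF-8 and UTF-16 are variable-length, and that the same bytes read differently depending on which encoding the reader assumes, because plain text does not record its own encoding.

```python
# Encodings to compare (the "-le" variants avoid a byte order mark in the output).
ENCODINGS = ["utf-8", "utf-16-le", "utf-32-le"]

def encoded_lengths(ch):
    """Return {encoding name: number of bytes} for a single character."""
    return {enc: len(ch.encode(enc)) for enc in ENCODINGS}

# Sample characters picked for illustration only.
for ch in ["A", "é", "€", "😀"]:
    print(repr(ch), encoded_lengths(ch))

# Plain text does not say which encoding it uses, so the reader must guess.
raw = "é".encode("utf-8")        # two bytes: 0xC3 0xA9
as_utf8 = raw.decode("utf-8")    # the intended character
as_latin1 = raw.decode("latin-1")  # the same bytes misread as two characters
print(as_utf8, as_latin1)
```

This is why an editor typically has to guess, or ask the user for, the encoding of a file before displaying it.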

History

Before text editors existed, computer text was punched into cards with keypunch machines. Physical boxes of these thin cardboard cards were then inserted into a card reader. Magnetic tape, drum and disk card image files created from such card decks often had no line-separation characters at all, and assumed fixed-length 80- or 90-character records. An alternative to cards was punched tape. It could be created by some teleprinters, which used special characters to indicate ends of records. Some early operating systems included batch text editors, either integrated with language processors or as separate utility programs; one early example was the ability to edit SQUOZE source files for SCAT in the SHARE Operating System.
The first interactive text editors were "line editors" oriented to teletypes: teleprinter-style or typewriter-style terminals that mechanically printed both input and output on the same continuous roll of paper, without an illuminated display. Commands effected edits to a file at an insertion point called the "cursor", which the operator needed to keep track of. Edits were verified by typing a command to print a small section of the file, and periodically by printing the entire file. In some line editors, the cursor could be moved by commands that specified the line number in the file, text strings for which to search, and eventually regular expressions. Line editors were major improvements over keypunching. Some line editors could be used by keypunch; editing commands could be taken from a deck of cards and applied to a specified file. Some common line editors supported a "verify" mode in which change commands displayed the altered lines.
An example configuration, circa 1975, was a Teletype Model 33 as a console to a PDP-11 running Version 6 Unix, with text manipulated using ed, the standard Unix text editor.
When computer terminals with video screens became available, screen-based text editors became common. One of the earliest full-screen editors was O26, which was written for the operator console of the CDC 6000 series computers in 1967. Another early full-screen editor was vi. Written in the 1970s, it is still a standard editor on Unix and Linux operating systems. Also written in the 1970s was the UCSD Pascal Screen Oriented Editor, which was optimized both for indented source code and general text. Emacs, one of the first free and open-source software projects, is another early full-screen or real-time editor, one that was ported to many systems. The 1977 Commodore PET was the first mass-market computer to feature a full-screen editor. A full-screen editor's ease of use and speed motivated many early purchases of video terminals.
The core data structure in a text editor is the one that manages the string or list of records that represents the current state of the file being edited. While a string could be stored in a single long contiguous array of characters, the desire for text editors that could more quickly insert text, delete text, and undo or redo previous edits led to the development of more complicated sequence data structures.
A typical text editor uses a gap buffer, a linked list of lines, a piece table, or a rope, as its sequence data structure.
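As an illustration of one of these structures, the following is a minimal Python sketch of a gap buffer. The class name GapBuffer, its method names, and the default gap size are assumptions made for this example, not the design of any particular editor. The idea is to keep an unused region (the gap) at the cursor, so that inserting or deleting at the cursor does not require shifting the rest of the text; only moving the cursor copies characters across the gap.

```python
class GapBuffer:
    """Character sequence laid out as [text before cursor][gap][text after cursor]."""

    def __init__(self, text="", gap_size=16):
        self.buf = list(text) + [None] * gap_size
        self.gap_start = len(text)     # cursor position (first gap slot)
        self.gap_end = len(self.buf)   # one past the last gap slot

    def __len__(self):
        # Length of the text, excluding the gap.
        return len(self.buf) - (self.gap_end - self.gap_start)

    def move_cursor(self, pos):
        """Move the gap so that it starts at text index pos (clamped to the text)."""
        pos = max(0, min(pos, len(self)))
        while self.gap_start > pos:    # cursor moves left: copy chars to the far side
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while self.gap_start < pos:    # cursor moves right: copy chars to the near side
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def insert(self, s):
        """Insert s at the cursor by filling gap slots, growing the gap if it is full."""
        for ch in s:
            if self.gap_start == self.gap_end:
                self._grow()
            self.buf[self.gap_start] = ch
            self.gap_start += 1

    def delete_before_cursor(self, count=1):
        """Delete up to count characters before the cursor by widening the gap."""
        self.gap_start = max(0, self.gap_start - count)

    def _grow(self):
        # Open a new gap at the cursor; the size policy here is arbitrary.
        extra = max(16, len(self.buf))
        self.buf[self.gap_end:self.gap_end] = [None] * extra
        self.gap_end += extra

    def text(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])


gb = GapBuffer("hello world")
gb.move_cursor(5)
gb.insert(",")
print(gb.text())
```

Repeated typing at one position only fills gap slots, which is the common case in interactive editing. A large jump of the cursor costs time proportional to the distance moved.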

Typology

Some text editors are small and simple, while others offer broad and complex functions. For example, Unix and Unix-like operating systems have the pico editor, but many also include the vi and Emacs editors. Microsoft Windows systems come with the simple Notepad, though many people, especially programmers, prefer other editors with more features. Apple's classic Mac OS included the native TeachText, which was replaced by SimpleText in 1994; SimpleText was in turn replaced in Mac OS X by TextEdit, which combines features of a text editor with those typical of a word processor, such as rulers, margins, and multiple font selection. These features are not available simultaneously, but must be switched by user command or by the program automatically determining the file type.
Most word processors can read and write files in plain text format, allowing them to open files saved from text editors. Saving these files from a word processor, however, requires ensuring the file is written in plain text format, and that any text encoding or BOM settings will not obscure the file for its intended use. Non-WYSIWYG word processors, such as WordStar, are more easily pressed into service as text editors, and in fact were commonly used as such during the 1980s. The default file format of these word processors often resembles a markup language, with the basic format being plain text and visual formatting achieved using non-printing control characters or escape sequences. Later word processors like Microsoft Word store their files in a binary format and are almost never used to edit plain text files.
Some text editors can edit unusually large files such as log files or an entire database placed in a single file. Simpler text editors may just read files into the computer's main memory. With larger files, this may be a slow process, and the entire file may not fit. Some text editors do not let the user start editing until this read-in is complete. Editing performance also often suffers in nonspecialized editors, with the editor taking seconds or even minutes to respond to keystrokes or navigation commands. Specialized editors have optimizations such as only storing the visible portion of large files in memory, improving editing performance.
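One way to keep only the visible portion of a file in memory can be sketched in Python as follows. This is an assumed approach for illustration, not a description of how any specific editor works, and the function names build_line_index and read_window are hypothetical. The file is scanned once in chunks to record the byte offset at which each line starts; afterwards, only the lines currently shown on screen are read and decoded.

```python
import os
import tempfile

def build_line_index(path, chunk_size=1 << 16):
    """Scan the file in chunks and return the byte offset where each line starts."""
    offsets = [0]
    pos = 0  # byte offset of the current chunk within the file
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            nl = chunk.find(b"\n")
            while nl != -1:
                offsets.append(pos + nl + 1)   # next line starts after the newline
                nl = chunk.find(b"\n", nl + 1)
            pos += len(chunk)
    # If the file ends with a newline, the last offset points at end of file.
    if len(offsets) > 1 and offsets[-1] == pos:
        offsets.pop()
    return offsets

def read_window(path, offsets, first_line, line_count, encoding="utf-8"):
    """Read and decode only lines first_line .. first_line + line_count - 1."""
    if first_line >= len(offsets):
        return []
    lines = []
    with open(path, "rb") as f:
        f.seek(offsets[first_line])
        for _ in range(min(line_count, len(offsets) - first_line)):
            lines.append(f.readline().decode(encoding).rstrip("\n"))
    return lines

# Demonstration on a temporary file standing in for a large log file.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False,
                                 encoding="utf-8", newline="\n") as tmp:
    for i in range(100_000):
        tmp.write(f"line {i}\n")
    path = tmp.name

offsets = build_line_index(path)
window = read_window(path, offsets, first_line=50_000, line_count=3)
print(len(offsets), window)
os.remove(path)
```

The index itself still costs one integer per line, and this sketch covers viewing only. Supporting edits would need an additional structure to track changes against the file on disk.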
Some editors are programmable, meaning that they can be customized for specific uses. With a programmable editor it is easy to automate repetitive tasks, add new functionality, or even implement a new application within the framework of the editor. One common motive for customizing is to make a text editor use the commands of another text editor with which the user is more familiar, or to duplicate missing functionality the user has come to depend on. Software developers often use editor customizations tailored to the programming language or development environment they are working in. The programmability of some text editors is limited to enhancing the core editing functionality of the program, but Emacs can be extended far beyond editing text files, for example for web browsing, reading email, online chat, managing files, or playing games, and is often thought of as a Lisp execution environment with a text user interface. Emacs can even be programmed to emulate vi, its rival in the traditional editor wars of Unix culture.
An important group of programmable editors uses REXX as a scripting language. These "orthodox editors" contain a "command line" into which commands and macros can be typed and text lines into which line commands and macros can be typed. Most such editors are derivatives of ISPF/PDF EDIT or of XEDIT, IBM's flagship editor for VM/SP through z/VM. Among them are THE, KEDIT, X2, Uni-edit, and SEDIT.
A text editor written or customized for a specific use can determine what the user is editing and assist the user, often by completing programming terms and showing tooltips with relevant documentation. Many text editors for software developers include source code syntax highlighting and automatic indentation to make programs easier to read and write. Programming editors often let the user select the name of an include file, function or variable, then jump to its definition. Some also allow for easy navigation back to the original section of code by storing the initial cursor location or by displaying the requested definition in a popup window or temporary buffer. Some editors implement this ability themselves, but often an auxiliary utility like ctags is used to locate the definitions.

Typical features

; Cut, copy, and paste: Most text editors provide methods to duplicate and move text within the file, or between files.
; Clipboard integration: Cut, copy and paste are often integrated with a system clipboard.
; Find and replace: Text editors provide extensive facilities for searching and replacing strings of text, either in a single file or across groups of files in open tabs or a selected folder. Advanced editors can use regular expressions to search and edit text or code. Additional features may include optional case sensitivity, a history of search terms for quick recall and autocompletion, and listing multiple results in one place.
; Undo and redo: A way to undo and redo previous edits. Often, especially with older text editors, only one level of edit history is remembered, and successively issuing the undo command will only "toggle" the last change. Modern or more complex editors usually provide a multiple-level history such that issuing the undo command repeatedly will revert the document to successively older edits. A separate redo command will cycle the edits "forward" toward the most recent changes.
; UTF-8 support: Ability to handle UTF-8 encoded text.
; Basic formatting: Text editors often provide basic formatting features like line wrap and auto-indentation.
; Jump to line: Ability to jump to a specified line number.
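
The multiple-level undo and redo history described in the list above can be sketched in Python with two stacks. This is a simplified illustration under stated assumptions: the class name UndoableText is hypothetical, and the sketch stores whole-document snapshots, whereas a real editor would more likely record individual edit operations to save memory.

```python
class UndoableText:
    """Document text with a multiple-level undo and redo history."""

    def __init__(self, text=""):
        self.text = text
        self._undo = []   # earlier states, most recent last
        self._redo = []   # undone states, most recent last

    def edit(self, new_text):
        """Apply an edit; a fresh edit discards anything that could be redone."""
        self._undo.append(self.text)
        self._redo.clear()
        self.text = new_text

    def undo(self):
        """Revert to the previous state; return False if there is nothing to undo."""
        if not self._undo:
            return False
        self._redo.append(self.text)
        self.text = self._undo.pop()
        return True

    def redo(self):
        """Reapply the most recently undone edit; return False if there is none."""
        if not self._redo:
            return False
        self._undo.append(self.text)
        self.text = self._redo.pop()
        return True


doc = UndoableText("a")
doc.edit("ab")
doc.edit("abc")
doc.undo()          # back to "ab"
doc.undo()          # back to "a"
doc.redo()          # forward to "ab"
print(doc.text)
```

Repeated calls to undo walk back through successively older states, and redo walks forward again until a new edit is made.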