Digitization


Digitization is the process of converting information into a digital format, i.e., a format that can be read by computers. The result is a representation of an object, image, sound, document, or signal obtained by generating a series of numbers that describe a discrete set of points or samples. This result is called a digital representation or, more specifically, a digital image for an object and a digital form for a signal. In contemporary practice, digitized data is expressed as binary numbers, which enables processing by digital computers and other operations. However, the fundamental process of digitizing entails "the conversion of analog source material into a numerical format"; the decimal or any other number system can be used instead.
Digitization is of crucial importance to data processing, storage, and transmission because it "allows information of all kinds in all formats to be carried with the same efficiency and also intermingled." Though analog data is typically more stable, digital data has the potential to be more easily shared and accessed and, in theory, can be propagated indefinitely without generation loss, provided it is migrated to new, stable formats as needed. This potential has led to institutional digitization projects designed to improve access and the rapid growth of the digital preservation field.
Digitization and digital preservation are sometimes mistaken for the same thing. They are different, but digitization is often a vital first step in digital preservation. Libraries, archives, museums, and other memory institutions digitize items to preserve fragile materials and create more access points for patrons. Doing this creates challenges for information professionals, and solutions can be as varied as the institutions that implement them. Some analog materials, such as audio and video tapes, are nearing the end of their life cycle, and it is important to digitize them before equipment obsolescence and media deterioration make the data irretrievable.
Digitization raises challenges and implications that include time, cost, cultural history concerns, and the creation of an equitable platform for historically marginalized voices. Many digitizing institutions develop their own solutions to these challenges.
Mass digitization projects have had mixed results over the years, but some institutions have had success even if not in the traditional Google Books model. Although e-books have undermined the sales of their printed counterparts, a study from 2017 indicated that the two cater to different audiences and use cases. A study of over 1,400 university students found that physical books are better suited to intensive study, while e-books provide a superior experience for leisurely reading.
Technological changes can happen often and quickly, so digitization standards are difficult to keep updated. Professionals in the field can attend conferences and join organizations and working groups to keep their knowledge current and add to the conversation.

Process

The term digitization is often used when diverse forms of information, such as an object, text, sound, image, or voice, are converted into a single binary code. The core of the process is the compromise between the capturing device and the player device, so that the rendered result represents the original source with the greatest possible fidelity. The advantage of digitization is the speed and accuracy with which this form of information can be transmitted, with no degradation compared with analog information.
Digital information is expressed using one of two digits, either 0 or 1. These digits are known as bits, and the sequences of 0s and 1s that constitute information are conventionally grouped into bytes, typically of eight bits each.
Analog signals are continuously variable, both in the number of possible values of the signal at a given time and in the number of points in the signal in a given period of time. Digital signals, by contrast, are discrete in both of those respects and are generally a finite sequence of integers. A digitization can therefore, in practical terms, only ever be an approximation of the signal it represents.
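As a small illustration of information held as bits, the following sketch converts a short text to 0s and 1s and back. It assumes UTF-8 encoding and the common convention of eight bits per byte; the function names `to_bits` and `from_bits` are illustrative only.

```python
def to_bits(text: str) -> str:
    """Return the UTF-8 encoding of text as space-separated 8-bit groups."""
    return " ".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def from_bits(bits: str) -> str:
    """Rebuild the text from space-separated 8-bit groups."""
    data = bytes(int(group, 2) for group in bits.split())
    return data.decode("utf-8")


encoded = to_bits("Hi")
print(encoded)             # 01001000 01101001
print(from_bits(encoded))  # Hi
```

Each group of eight digits is one byte; the round trip recovers the original text exactly, which is the sense in which digital data can be copied without generation loss.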
Digitization occurs in two parts:
  • Discretization: The reading of an analog signal A by sampling its value at regular time intervals. Each such reading is called a sample and may be considered to have infinite precision at this stage.
  • Quantization: The rounding of each sample to the nearest value in a fixed set of numbers.
In general, these can occur at the same time, though they are conceptually distinct.
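The two parts can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular converter: it assumes a signal with values in the range -1 to 1, evenly spaced quantization levels, and a 1 Hz sine wave as the hypothetical analog source. The names `sample`, `quantize`, and `tone` are illustrative.

```python
import math


def sample(signal, duration_s, rate_hz):
    """Discretization: read the signal at regular time intervals."""
    count = int(duration_s * rate_hz)
    return [signal(n / rate_hz) for n in range(count)]


def quantize(samples, bits, lo=-1.0, hi=1.0):
    """Quantization: map each sample to the nearest of 2**bits evenly spaced levels."""
    levels = 2 ** bits
    step = (hi - lo) / (levels - 1)
    codes = []
    for value in samples:
        clipped = min(max(value, lo), hi)  # keep out-of-range readings inside [lo, hi]
        codes.append(round((clipped - lo) / step))
    return codes


def tone(t):
    """Hypothetical analog source: a 1 Hz sine wave."""
    return math.sin(2 * math.pi * 1.0 * t)


samples = sample(tone, duration_s=1.0, rate_hz=8)  # 8 real-valued readings
codes = quantize(samples, bits=3)                  # 8 integers between 0 and 7
print(codes)
```

Here the samples are still floating-point readings, while the codes are small integers that fit in three bits each.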
A series of digital integers can be transformed into an analog output that approximates the original analog signal. Such a transformation is called a digital-to-analog conversion. The sampling rate and the number of bits used to represent the integers together determine how closely a digitization approximates the analog signal.
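The effect of sampling rate and bit depth can be explored with a rough sketch. It assumes a 2 Hz sine wave with values between -1 and 1 as the source and a zero-order hold (each decoded value is held until the next sample) as a very simple stand-in for digital-to-analog conversion; real converters typically add smoothing filters. All names are illustrative.

```python
import math


def digitize(signal, rate_hz, bits, duration_s=1.0):
    """Sample a signal with values in [-1, 1] and quantize it to integer codes."""
    step = 2.0 / (2 ** bits - 1)
    count = int(duration_s * rate_hz)
    return [round((signal(n / rate_hz) + 1.0) / step) for n in range(count)]


def reconstruct(codes, rate_hz, bits, t):
    """Zero-order hold: output the decoded value of the most recent sample."""
    step = 2.0 / (2 ** bits - 1)
    index = min(int(t * rate_hz), len(codes) - 1)
    return codes[index] * step - 1.0


def max_error(signal, rate_hz, bits, probes=1000):
    """Largest gap between the source and its reconstruction over one second."""
    codes = digitize(signal, rate_hz, bits)
    return max(
        abs(signal(k / probes) - reconstruct(codes, rate_hz, bits, k / probes))
        for k in range(probes)
    )


def tone(t):
    """Hypothetical analog source: a 2 Hz sine wave."""
    return math.sin(2 * math.pi * 2.0 * t)


for rate_hz, bits in [(16, 4), (64, 8), (1024, 12)]:
    print(rate_hz, bits, max_error(tone, rate_hz, bits))
```

With these settings the measured error shrinks as the sampling rate and bit depth increase, although it never reaches zero.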

Examples

The term is used to describe, for example, the scanning of analog sources into computers for editing, 3D scanning that creates a 3D model of an object's surface, and audio and texture map transformations. In this last case, as in normal photos, the sampling rate refers to the resolution of the image, often measured in pixels per inch.
Digitizing is the primary way of storing images in a form suitable for transmission and computer processing, whether they are scanned from two-dimensional analog originals, captured using an image sensor-equipped device such as a digital camera or a tomographical instrument such as a CAT scanner, or built from precise dimensions acquired from a real-world object, such as a car, using a 3D scanning device.
Digitizing is central to making digital representations of geographical features, using raster or vector images, in a geographic information system, i.e., the creation of electronic maps, either from various geographical and satellite imagery or by digitizing traditional paper maps or graphs.
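The relationship between resolution and the amount of data produced can be worked through with simple arithmetic. The sketch below assumes a hypothetical 6 by 4 inch print scanned at 300 pixels per inch and stored uncompressed at 24 bits per pixel; the figures and function names are illustrative only.

```python
def scan_dimensions(width_in, height_in, ppi):
    """Pixel dimensions of a scan of a flat original at a given resolution."""
    return round(width_in * ppi), round(height_in * ppi)


def uncompressed_bytes(width_px, height_px, bits_per_pixel=24):
    """Storage needed for the raw pixel data, ignoring file headers and compression."""
    return width_px * height_px * bits_per_pixel // 8


width_px, height_px = scan_dimensions(6, 4, 300)
print(width_px, height_px)                      # 1800 1200
print(uncompressed_bytes(width_px, height_px))  # 6480000
```

Doubling the resolution to 600 pixels per inch doubles each dimension and therefore quadruples the raw data size.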
"Digitization" is also used to describe the process of populating databases with files or data. While this usage is technically inaccurate, it originates with the previously proper use of the term to describe that part of the process involving digitization of analog sources, such as printed pictures and brochures, before uploading to target databases.
Digitizing may also be used in the field of apparel, where an image may be recreated with the help of embroidery digitizing software tools and saved as embroidery machine code. This machine code is fed into an embroidery machine and applied to the fabric. The most widely supported format is the DST file. Apparel companies also digitize clothing patterns.

History

  • 1957: Russell Kirsch used a rotating drum scanner and photomultiplier connected to the Standards Electronic Automatic Computer (SEAC) to create the first digital image from a photo of his infant son. This image was stored in SEAC memory via a staticizer and viewed via a cathode ray oscilloscope.
  • 1971: Invention of charge-coupled devices, which made conversion from analog data to a digital format easy.
  • 1986: Work started on the JPEG format.
  • 1990s: Libraries began scanning collections to provide access via the World Wide Web.

Analog signals to digital

Analog signals are continuous electrical signals; digital signals are non-continuous. Analog signals can be converted to digital signals by using an analog-to-digital converter.
The process of converting analog to digital consists of two parts: sampling and quantizing. Sampling measures wave amplitudes at regular intervals, splits them along the vertical axis, and assigns them a numerical value, while quantizing looks for measurements that are between binary values and rounds them up or down.
Nearly all recorded music has been digitized, and about 12 percent of the 500,000+ movies listed on the Internet Movie Database are digitized and were released on DVD.
Digitization of home movies, slides, and photographs is a popular method of preserving and sharing personal multimedia. Slides and photographs may be scanned quickly using an image scanner, but analog video requires a video tape player to be connected to a computer while the item plays in real time. Slides can be digitized more quickly with a slide scanner such as the Nikon Coolscan 5000ED.
Another example of digitization is the VisualAudio process developed by the Swiss Fonoteca Nazionale in Lugano. By scanning a high-resolution photograph of a record, they are able to extract and reconstruct the sound from the processed image.
Digitization of analog tapes before they degrade, or after damage has already occurred, can rescue the only copies of local and traditional cultural music for future generations to study and enjoy.

Analog texts to digital

Academic and public libraries, foundations, and private companies like Google are scanning older print books and applying optical character recognition technologies so they can be keyword searched, but as of 2006, only about 1 in 20 texts had been digitized. Librarians and archivists are working to increase this statistic and in 2019 began digitizing 480,000 books published between 1923 and 1964 that had entered the public domain.
Unpublished manuscripts and other rare papers and documents housed in special collections are being digitized by libraries and archives, but backlogs often slow this process and keep materials with enduring historical and research value hidden from most users. Digitization has not completely replaced other archival imaging options, such as microfilming, which is still used by institutions such as the National Archives and Records Administration to provide preservation and access to these resources.
Digital versions of analog texts can potentially be accessed from anywhere in the world, but they are not as stable as most print materials or manuscripts and are unlikely to be accessible decades from now without further preservation efforts. By contrast, many books, manuscripts, and scrolls have already been around for centuries. However, for some materials that have been damaged by water, insects, or catastrophes, digitization might be the only option for continued use.