Many areas of public health, including vital statistics, investigation and research, surveillance, epidemiology, surveys, laboratory technology, maternal and child health, and environmental health, use information technology (IT) to achieve their goals and objectives. IT includes the use of computers and communications, and the transformation of data into information and knowledge.
In the 1960s, "number crunching" was one of the first applications for which computers were used in the hospital environment. A decade later, in the early 1970s, IT applications were being used in admissions, patient care, clinical laboratories, and intensive care units. In the 1990s, the fusion of computers and all forms of communication became commonplace in all aspects of life. The Internet and the World Wide Web (WWW) are now tools that both professionals and laypeople use for all types of business. An evolution has occurred in the ways people use computers, in the power, capacity, and speeds of computers, and in the way systems are put together and integrated (see Table 1).
Table 1
|Text, Graphics, Multimedia Common Sound, Still-Video, and Motion-Video Formats on the Web|
|Extension||Text & Graphics Formats||Explanation|
|TXT, DOC||MS Word||Word processing application|
|WPD||WordPerfect||Word processing application|
|RTF||Rich Text Format||Method of encoding text formatting and document structure using ASCII character set.|
|PPT||Microsoft PowerPoint||Presentation Graphics application|
|PRS||Harvard Graphics||Presentation Graphics application|
|XLS||MS Excel||Spreadsheet application|
|HTM||MS FrontPage||Editor: application for creating and editing Web pages. Explorer: application for maintaining, testing, and publishing webs.|
|Extension||Sound Formats||Explanation|
|RA||RealAudio||Used with RealAudio Web Server and RealAudio Player add-on for browsers|
|SBI||Sound Blaster Instrument||Used for a single instrument with Sound Blaster cards|
|WAV||MS Waveform||Sound format used in Windows for event notification|
|Extension||Still-Video/Graphics Formats (SVF)||Explanation (Raster or bitmap images)|
|GIF||Graphics Interchange Format||Compressed graphics format commonly used on CompuServe|
|PCC, PCX||PC Paintbrush||Bitmap images|
|JPEG, JPG||Joint Photographic Experts Group||Highly compressed format for still images, widely used for multi-platform graphics|
|TIFF||Tagged Image File Format||High-resolution, tag-based graphics format used for the universal interchange of digital graphics|
|PCD||Photo CD||A graphics file format developed by Eastman Kodak Company|
|PDF||Portable Document Format||Adobe's format for multi-platform document access through its Acrobat software|
|PS||PostScript||Adobe's type description language, used to deliver complex documents over the Internet|
|Extension||Still-Video/Graphics Formats (SVF)||Explanation (Vector images)|
|CGM||Computer Graphics Metafile|
|WMF||Windows Metafile||Used mostly for word-processing clip art|
|WPG||WordPerfect Graphics||Word-processing clip art|
|Extension||Motion-Video Formats (MVF)||Explanation|
|DVI||Digital Video Interactive||MVFs found in CD-ROMs|
|FLI||Flick||Autodesk Animator MVF|
|MPEG, MPG||Moving Picture Experts Group||Full-motion video standard using frame format similar to JPEG with variable compression capabilities|
|MOV||QuickTime||Apple's motion video and audio format (originally for Macintosh, available for Windows)|
|SOURCE: Courtesy of author.|
Most people tend to think about computers in terms of the systems that they use at home or at work. Most of the time these are "stand-alone" models, such as desktops, laptops, or notebooks, and sometimes they are wireless devices, such as palm pilots, personal organizers, and third-generation cellular phones that allow access to e-mail and the Internet. Although public health has not yet taken full advantage of these technologies, it is important to understand the basics of these technologies in order to visualize their potential uses in the near and long-term future.
Initially, the computer was conceived as a device to manipulate numbers and solve arithmetical problems. During its development, it was recognized that a machine capable of manipulating numbers could also be used to manipulate any "symbol" represented in numeric form. An electronic data processing system (EDPS) involves at least three basic elements: the input entering the system, or
The central processing unit (CPU) is the control center of the EDPS, and it has two parts: the "arithmetic/logic unit" (ALU) and the "control unit." The ALU performs operations such as addition, subtraction, multiplication, and division; as well as moving, shifting, and comparing data. The control section of the CPU directs and coordinates all the operations of the computer according to the conditions set forth by the stored program. It selects instructions from the stored program and interprets them. It then generates signals and commands that cause other system units to perform certain operations at appropriate times. It controls the input/output units, the arithmetic-logic operations of the CPU, and the transfer of data to and from storage. It acts as a central nervous system, but performs no actual processing operations on data.
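The division of labor between the control unit and the ALU can be illustrated with a toy stored-program machine. The sketch below is only an illustration under stated assumptions: the instruction names (LOAD, ADD, SUB, STORE, HALT), the single accumulator, and the function names `alu` and `run` are invented for the example and do not describe any particular computer. For simplicity the program is kept in its own list rather than in the same storage as the data.

```python
# Toy stored-program machine (illustrative only; all names are hypothetical).

def alu(op, a, b):
    """Arithmetic/logic unit: the only place where data is operated on."""
    if op == "ADD":
        return a + b
    if op == "SUB":
        return a - b
    raise ValueError(f"unknown ALU operation: {op}")


def run(program, memory):
    """Control unit: selects each instruction, interprets it, and signals
    the ALU or storage. It performs no arithmetic itself."""
    accumulator = 0
    counter = 0  # address of the next instruction to select
    while True:
        opcode, address = program[counter]  # select the instruction
        counter += 1
        if opcode == "HALT":                # interpret it
            return memory
        if opcode == "LOAD":
            accumulator = memory[address]   # transfer from storage
        elif opcode == "STORE":
            memory[address] = accumulator   # transfer to storage
        else:
            accumulator = alu(opcode, accumulator, memory[address])


# Compute memory[0] + memory[1] - memory[2] and store it at address 3.
program = [("LOAD", 0), ("ADD", 1), ("SUB", 2), ("STORE", 3), ("HALT", None)]
memory = run(program, [7, 5, 2, 0])
print(memory[3])
```

Each storage location is reached by its address (here, a list index), which is how the control section locates the data it needs.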
Storage Devices. The main storage of a computer (the memory, or internal storage unit) is basically an electronic filing cabinet in which each location is capable of holding data and instructions. The storage unit holds four kinds of content: (1) all data being held for processing, (2) the data being processed, (3) the final result of processing until it is released as output, and (4) all the program instructions while processing is being carried out. Each location in main storage is identified by a particular address. Using this address, the control section can readily locate data and instructions as needed. The size or capacity of main storage determines the amount of data and instructions that can be held within the system at any one time. In summary, the internal memory is temporary storage and is called "random access memory" (RAM). There is also a second type of memory, called "read-only memory" (ROM). This memory is fixed, meaning it can be read but cannot be written to, changed, or deleted. There are also secondary memory devices, or auxiliary storage, such as diskettes and hard drives (direct access) and magnetic tape (sequential access). The choice among these auxiliary devices depends on how often the data will be used. For example, mass storage devices or certain types of tape may be used to archive medical records or bank accounts, where legal requirements may apply to the retention of the data.
Input/Output (I/O) Devices. These are devices that are linked to the computer and can introduce data into the system, and devices that can accept data after it has been processed. Some examples are disk storage drives, printers, magnetic tape units, display stations, data transmission units, and the old punched card or paper tape. Input devices convert data from a form that is intelligible to the user to a form that is intelligible to the computer. Output, on the other hand, is data that has been processed (e.g., shown on a display device). In some cases, a printer can readily display the data in an understandable form. In other instances, such as with a magnetic tape drive, the data is carried as input for further processing by another device. In this case, the computer retains the data until further processing takes place. In summary, the term "digital computer" identifies an electronic device capable of manipulating bits of information under the control of sequenced instructions stored within the memory of the device. Some common forms of storing data today include floppy disks (used mainly for temporary storage), magnetic disks (fixed or removable), and optical disks that can store very large amounts of data. CD-ROM (compact disk, read-only memory) devices read the stored information by means of a finely focused laser beam that detects reflections from the disk. Optical disks that can be recorded once and thereafter only read are sometimes referred to by the term "write once, read many times" (WORM).
Computer System. The computer elements described thus far are known as "hardware." A computer system has three parts: the hardware, the software, and the people who make it work. Computer software can broadly be divided into two categories: systems software and application software, or programs. Systems software can be further divided into operating systems and programming languages. A computer program is a set of commands (in the form of numeric codes) that is put into the computer's memory to direct its operation. Testing, or debugging, is done to check whether a program works properly. The ongoing process of correcting errors and modifying working programs is called software maintenance. The science of software engineering has provided formal methods for writing and testing programs.
DATA PROCESSING, DATA REPRESENTATION
When people communicate by writing in any language, the symbols used (the letters of the alphabet, numerals, and punctuation marks) convey information. The symbols themselves are not information, but representations of information. Data in an EDPS must be expressed symbolically so that the machines can interpret the information presented by humans. In general, the symbols that are read and interpreted by a machine differ from those used by people. The designer of a computer system determines the nature and meaning of a particular set of symbols that can be read and interpreted by the system. The actual data that is used by these systems is (or was in the past) presented as holes on punched cards or paper tape; as spots on magnetic tape; as bits (binary digits) or bytes of information on a disk, diskette, CD-ROM, or optical disk; as magnetic-ink characters; as pixels in display-screen images; as points in plotted graphs; or as communication-network signals.
In many instances, communication occurs between machines. This communication can be a direct exchange of data in electronic form over cables, wires, radio waves, infrared, satellites or even wireless devices such as cellular phones, pagers, and hand-held personal organizers and/or notebooks. It can also be an exchange where the recorded or stored output of one device or system becomes the input of another machine or system.
In the computer, data is recorded electronically. The presence or absence of a signal in specific circuitry represents data in the computer the same way that the absence or presence of a punched hole represented data in a punched card. If we think of an ordinary lightbulb being either on or off, we could define its operation as a binary mode. That means that at any given time the lightbulb can be in only one of two possible conditions. This is known as a "binary state." In a computer, transistors are conducting or nonconducting; magnetic materials are magnetized in one direction or in the opposite direction; a switch or relay is either on or off; a specific voltage is either present or absent. These are all binary states. Representing data within the computer is accomplished by assigning a specific value to each binary indication or group of binary indications. Binary signals can be used to represent both instructions and data; consequently the basic language of the computer is based primarily on the "binary number system."
A binary method of notation is usually used to illustrate binary indications. This method uses only two symbols, 0 and 1, where 0 and 1 represent the absence and presence of an assigned value, respectively. These symbols, or binary digits, are called "bits." A group of eight bits is known as a "byte," and on many machines a group of 32 bits (4 bytes) is known as a "word." The bit positions within a byte or a word have place values related to the binary number system. In the binary number system the values of these symbols are determined by their positions in a multidigit numeral. The position values are based on the right-to-left progression of powers of the base 2 (2⁰, 2¹, 2², 2³), commonly employed within digital computers. For example, if four lightbulbs next to each other are numbered 4, 3, 2, and 1, and bulbs 1 and 3 are "on" while bulbs 2 and 4 are "off," the binary notation is 0101.
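The place-value rule can be checked with a short calculation. The following is a minimal sketch of the four-lightbulb example; the function name `bits_to_value` and the variable names are assumptions made for illustration.

```python
def bits_to_value(bits):
    """Add up the place values (powers of 2) of the positions that are on.

    `bits` is a string such as "0101"; the rightmost position has
    place value 2**0, the next 2**1, and so on from right to left.
    """
    total = 0
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position
    return total


# Bulbs are numbered 4, 3, 2, 1 from left to right; bulbs 1 and 3 are on.
bulbs_on = {1, 3}
notation = "".join("1" if bulb in bulbs_on else "0" for bulb in (4, 3, 2, 1))
value = bits_to_value(notation)
print(notation, value)  # prints: 0101 5
```

Bulb 1 contributes 2⁰ = 1 and bulb 3 contributes 2² = 4, so the pattern 0101 stands for the decimal value 5.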
The system of expressing decimal digits as an equivalent binary value is known as Binary Coded Decimal (BCD). In this code, all characters (64 characters can be coded), including alphabetic, numeric, and special signs, are represented using six positions of binary notation (plus a parity bit position). The Extended Binary Coded Decimal Interchange Code (EBCDIC) uses eight binary positions for each character format plus a position for parity checking (256 characters can be coded). The American Standard Code for Information Interchange (ASCII) is a seven-bit code that offers 128 possible characters. ASCII was developed by users of communications and data processing equipment as an attempt to standardize machine-to-machine and system-to-system communication.
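The character counts quoted for each code follow directly from the number of binary positions: n positions allow 2ⁿ distinct patterns. The sketch below works this out and also shows how a parity position can be attached. The text does not say whether parity is even or odd, so even parity is an assumption here, as are the function names.

```python
def capacity(bit_positions):
    """Number of distinct characters that n binary positions can encode."""
    return 2 ** bit_positions


def with_even_parity(code, width):
    """Return `code` as a `width`-bit string with a leading parity bit.

    Even parity (an assumption for this sketch): the parity bit is chosen
    so that the total number of 1s, parity bit included, is even.
    """
    data = format(code, f"0{width}b")
    parity = data.count("1") % 2
    return str(parity) + data


for name, width in (("BCD", 6), ("ASCII", 7), ("EBCDIC", 8)):
    print(name, width, capacity(width))

# The seven-bit ASCII code for "A" is 1000001 (two 1s), so the parity bit is 0.
framed = with_even_parity(ord("A"), 7)
print(framed)  # prints: 01000001
```

A receiver can recount the 1s in each character; an odd count signals that a single bit was altered in transit.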
Computer Number Systems and Conversions. Representing a decimal number in binary may require very long strings of ones and zeros. The hexadecimal system is used as a shorthand method to represent them. The base of this system is 16, and the symbols used are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F. Each hexadecimal symbol stands for a group of four bits; F, for example, is 15 in decimal notation and 1111 in binary.
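As a rough illustration of the shorthand, the sketch below rewrites a binary string as hexadecimal by taking the bits four at a time. The function name `to_hex_shorthand` is an assumption made for the example; Python's built-in `int` and `format` are used only to cross-check the result.

```python
def to_hex_shorthand(bits):
    """Rewrite a binary string as hexadecimal, four bits per hex symbol."""
    symbols = "0123456789ABCDEF"
    # Pad on the left with zeros so the length is a multiple of four.
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(symbols[int(group, 2)] for group in groups)


print(to_hex_shorthand("1111"))       # prints: F
print(to_hex_shorthand("101101"))     # prints: 2D
print(int("F", 16), format(15, "b"))  # prints: 15 1111
```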
Programming Languages Techniques. Assembler languages are closer to machine instructions than to human language, and having to express
A programmer writes a source program in a human-readable programming language. A compiler translates these English-like statements into instructions that the computer can execute; such instructions are called an "object program." The object program is then combined with library routines and executed, and an "output" is produced. There are some "optimizing compilers" that automatically correct obvious inefficiencies in source programming. An "interpreter," by contrast, executes the user program piece by piece, which allows a program to be debugged as it runs. MUMPS, LISP, and APL are interpreted languages used in this way in the health care environment, artificial intelligence, and mathematics fields, respectively. Because of the time and costs associated with development, it is generally more cost-effective in today's environment to buy an application package from a vendor (if one is available) than to develop it. The costs are thus spread among thousands of users. Typical application packages used for public health purposes are SAS and SPSS (for biostatistics) and ArcView/GIS (for Geographical Information Systems). In addition, some database products (e.g., Oracle and dBASE) provide data manipulation languages written for this purpose. A data manipulation language (DML) is a special sublanguage used for handling data storage and retrieval in a database system. Using a data definition language (DDL), programmers can organize and structure data on secondary storage devices.
Data Acquisition. Capturing and entering data into a computer is expensive. Direct acquisition of data avoids the need for people to read and measure values and then encode and/or enter the data. Automated data acquisition can help eliminate errors and speed up the procedure. Sensors connected to a patient convert biological signals into electrical signals that are transmitted to a computer. Many of these signals (e.g., ECG, blood pressure, heart rate) are analog, and they must be converted to digital form before they can be stored. This process is called analog-to-digital conversion (ADC).
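The difference between a data definition language and a data manipulation language can be seen in a few statements. The sketch below uses SQL through Python's built-in `sqlite3` module purely as a stand-in; the text does not name SQL or SQLite, and the table, column names, and sample values are invented for illustration.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")

# Data definition (DDL): describe how the data is organized and structured.
conn.execute(
    "CREATE TABLE visit (patient TEXT, visit_date TEXT, charge REAL)"
)

# Data manipulation (DML): store rows, then retrieve a result from them.
conn.executemany(
    "INSERT INTO visit VALUES (?, ?, ?)",
    [
        ("Patient A", "2001-03-01", 120.0),  # made-up sample rows
        ("Patient B", "2001-03-02", 80.0),
        ("Patient A", "2001-04-15", 45.5),
    ],
)
total = conn.execute(
    "SELECT SUM(charge) FROM visit WHERE patient = ?", ("Patient A",)
).fetchone()[0]
print(total)  # prints: 165.5
```

The CREATE statement fixes the structure once; the INSERT and SELECT statements can then be issued repeatedly without restating how the data is laid out.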
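Analog-to-digital conversion involves two steps: reading the signal at regular intervals (sampling) and mapping each reading to one of a fixed number of levels (quantization). The sketch below is a simplified illustration, not a model of any real converter; the sine wave standing in for a sensor signal, the 3-bit resolution, and all names are assumptions chosen to keep the example small.

```python
import math


def sample_and_quantize(signal, duration_s, sample_rate_hz, bits, v_min, v_max):
    """Sample `signal` (a function of time) and map each reading to an
    integer code between 0 and 2**bits - 1."""
    levels = 2 ** bits
    step = (v_max - v_min) / (levels - 1)  # voltage width of one level
    codes = []
    for i in range(int(duration_s * sample_rate_hz)):
        t = i / sample_rate_hz
        reading = min(max(signal(t), v_min), v_max)  # clip to input range
        codes.append(round((reading - v_min) / step))
    return codes


def analog_signal(t):
    """Stand-in for a sensor output: a 1 Hz sine wave between -1 and +1."""
    return math.sin(2 * math.pi * t)


codes = sample_and_quantize(analog_signal, 1.0, 8, 3, -1.0, 1.0)
print(codes)
```

More bits per sample and a higher sampling rate give a closer digital approximation of the original signal, at the cost of more data to store.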
DATABASES AND DATABASE MANAGEMENT SYSTEMS
A database (DB) system is a computer-based record keeping system used to record and maintain certain types of information that have a significant value to some organization. A DB is a repository of stored data, which in general is both integrated and shared. Between the physical database and the users of the system is a layer of software, usually called the database management system (DBMS). All requests from the users to access the DB are handled by the DBMS.
A DB helps the user enter, store, and retrieve the data and information of an organization, and it becomes a key player when all or part of the information of the enterprise must be integrated. Normally, within the DB, information is organized into data elements, fields, records, and files. In a system such as a hospital information system (HIS), a patient name is a data element or a field; a record could be related to that patient's visit on a particular date and at a particular time (e.g., date, diagnoses, treatments, charges, medications, tests); and a file would contain all the information from all the visits for that patient. An HIS DB will include not only patient files, but it could also have accounting information related to charges, inventory, payroll, and personnel records. With DB systems, different people can have access to different parts of the system, so that, for example, not all employees will have access to laboratory results.
The DBMS organization and definition of the contents of the individual data elements, fields, records, and files are provided via a machine-readable definition called a "schema." This makes the logical location of the content of a DB independent of its physical location. The DBMS not only "manages the DB" but also allows for entering, editing, and retrieving results. The DBMS helps with the integration of data coming from multiple sources. The user can also access and retrieve specific types of information via queries.
A DB provides an organization with centralized control of its operational data. Some of the advantages of this centralized control are:
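The field, record, and file hierarchy, together with the idea that different people see different parts of the system, can be sketched in a few lines. Everything below is hypothetical: the class name, the two roles, the fields each role may read, and the sample values are invented for illustration and are not drawn from any actual hospital information system.

```python
from dataclasses import dataclass, field


@dataclass
class VisitRecord:
    """One record: the fields tied to a single patient visit."""
    patient_name: str  # a data element (field)
    visit_date: str
    diagnoses: list = field(default_factory=list)
    lab_results: list = field(default_factory=list)
    charges: float = 0.0


# Hypothetical roles and the fields each one is allowed to read.
VISIBLE_FIELDS = {
    "clinician": {"patient_name", "visit_date", "diagnoses", "lab_results"},
    "billing": {"patient_name", "visit_date", "charges"},
}


def view(record, role):
    """Return only the fields of `record` that `role` may see."""
    allowed = VISIBLE_FIELDS[role]
    return {name: value for name, value in vars(record).items() if name in allowed}


# A file: all the records for one patient (made-up sample values).
patient_file = [
    VisitRecord("Patient A", "2001-03-01", ["influenza"], ["culture: negative"], 120.0),
    VisitRecord("Patient A", "2001-04-15", ["follow-up"], [], 45.5),
]

billing_view = view(patient_file[0], "billing")
print(billing_view)
```

In a real DBMS this filtering is enforced by the database software itself rather than by application code, so that every request passes through the same checks.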
- Redundancies can be reduced.
- Inconsistencies can be avoided.
- Data can be shared.
- Standards can be enforced.
- Privacy, confidentiality, authenticity, and security restrictions can be applied.
- Integrity can be maintained.
- Conflicting requirements (among users) can be balanced (for the enterprise).
- Data is easier to support (the single repository, the application, and the end users).
Due to technological advancements, databases today are much more complex than they were a few decades ago. They contain "multimedia" information, such as text, graphics, scanned images from documents, clinical images from all modalities (X-rays, ultrasound, MRI, CT scan), still and dynamic studies, and sound. When doing population studies, the creation of data "warehouses" is necessary, and data "mining" techniques are used to extrapolate results. In public health, the data needed for a study can reside in a small computer, in a local area network (LAN), or in a wide area network (WAN). In order to use information that is geographically distributed (and/or with distributed users), it is important to learn techniques for data integration and data communications. Because of the continuing fusion of computers and communications, this is the fastest changing area within information technology.
INTERNET AND THE WORLD WIDE WEB
There is little historical precedent for the swift and dramatic growth of the Internet, which was originally a limited scientific communication network developed by the U.S. government to facilitate cooperation among federal researchers and the university research community. With its rapid adoption by the private sector, the Internet has remained an important research tool, and it is also becoming a vital ingredient in maintaining and increasing the scientific and commercial leadership of the United States. In the twenty-first century, the Internet will provide a powerful and versatile environment for business, education, culture, entertainment, health care and public health. Sight, sound, and even touch will be integrated through powerful computers, displays, and networks. People will use this environment to work, study, bank, shop, entertain, visit with each other, and communicate with their health care providers. Whether at the office, at home, or traveling, the environment and its interface will be largely the same, and security, reliability, and privacy will be built in. Benefits of this dramatically different environment will include a more agile economy, improved health care (particularly in rural areas), less stress on ecosystems, easy access to lifelong and distance learning, a greater choice of places to live and work, and more opportunities to participate in the community, the nation, and the world.
Internet and WWW Acronyms. People who communicate with each other electronically may not have the same "platform." "Cross-platform" means that people do not have to use the same kind of operating system to access files on a remote system. There are two basic mechanisms for accessing the Web: (1) using the telephone system to link to another computer or network that is connected to the Internet, and (2) connecting to a network, and from there to the Internet. An Internet service provider (ISP) may be required to access the Internet. An important factor regarding Internet access is bandwidth, which determines how much data a connection can accommodate and the speed at which data can be accessed.
Information on the Web is generally written in Hypertext Markup Language (HTML), which is a text-based markup language that describes the structure of a Web document's content and some of its properties. It can also be viewed as a way of representing text and linking it to other resources, such as multimedia files, graphic files, still or dynamic image files, and sound files. HTML contains the information or text to be displayed and the control needed for its display or playback.
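To make the idea of text linked to other resources concrete, the sketch below parses a small HTML page and lists the resources it points to. The page content, file names, and the class name `ResourceCollector` are invented for illustration; the parsing uses Python's standard `html.parser` module.

```python
from html.parser import HTMLParser

# A made-up page: text to display plus markup that links to other resources.
PAGE = """<html><head><title>Clinic Hours</title></head>
<body>
<h1>Clinic Hours</h1>
<p>See the <a href="schedule.html">schedule</a> and a
<a href="map.gif">map</a>.</p>
<img src="logo.gif" alt="Clinic logo">
</body></html>"""


class ResourceCollector(HTMLParser):
    """Collect hyperlink targets and image sources found in the markup."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.images = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and "href" in attributes:
            self.links.append(attributes["href"])
        elif tag == "img" and "src" in attributes:
            self.images.append(attributes["src"])


collector = ResourceCollector()
collector.feed(PAGE)
print(collector.links)   # prints: ['schedule.html', 'map.gif']
print(collector.images)  # prints: ['logo.gif']
```

The tags carry the control information (what is a heading, what is a link, what image to load), while the text between them is the content to be displayed.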
Navigation Tools. Prior to the use of Web browsers, there were several Internet navigation tools that required more user expertise than the modern browser, including:
- File Transfer Protocol (FTP), a cross-platform protocol for transferring files to and from computers anywhere on the Internet.
- Gopher, a tool for browsing files on the Internet.
- Usenet, a worldwide messaging system through which anyone can read and post articles to a group of individuals who share the same interests.
- Wide Area Information Server (WAIS), one of a handful of Internet search tools that can be spread across the network to scour multiple archives and handle multiple data formats.
- Hyperlink (also called link), a pointer— from text, from a picture or a graphic, or from an image map—to a page or file on the World Wide Web; hyperlinks are the primary way to navigate between Web pages and among Web sites.
Today, a Web browser is the main piece of software required by the end user to find information through the Internet. Some of the most popular browsers are Lynx, Mosaic, Netscape Navigator/Communicator, and Internet Explorer. Lynx is a text-only Web browser; it cannot display graphical or multimedia elements. Mosaic was the first "full-featured" graphical browser for the Web. It was developed by a team of programmers at the National Center for Supercomputing Applications (NCSA). One of these programmers, Marc Andreessen, later cofounded Netscape. Netscape Navigator/Communicator is one of the most popular Web browsers. Internet Explorer is Microsoft's Web browser.
Web Resources. A Uniform Resource Locator (URL) is an address that names the protocol needed to access a particular resource or site on the Web and then points to the resource's Internet location. URLs are, in short, used to locate information on the Web.
Normally the URL is composed of six parts:
- The protocol or data source (e.g., ftp://, gopher://, news://, telnet://, wais://, http://)
- The domain name (for the Web server where the desired information resides)
- The port address
- The directory path (location of the Web page in the Web server's file system)
- The object name
- The spot (precise location within the file)
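The six parts listed above can be pulled out of a URL with Python's standard `urllib.parse` module. In this sketch the URL itself is a made-up example on the reserved `example.org` domain, and splitting the path into a directory and an object name is done by hand for illustration.

```python
from urllib.parse import urlsplit

# A made-up URL that happens to contain all six parts.
url = "http://www.example.org:8080/reports/1999/summary.html#methods"

parts = urlsplit(url)
# Separate the directory path from the object (file) name.
directory, _, object_name = parts.path.rpartition("/")

print("protocol:   ", parts.scheme)    # http
print("domain name:", parts.hostname)  # www.example.org
print("port:       ", parts.port)      # 8080
print("directory:  ", directory)       # /reports/1999
print("object name:", object_name)     # summary.html
print("spot:       ", parts.fragment)  # methods
```

In practice several parts are often omitted; when no port is given, for example, the browser falls back on the default port for the protocol.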
Protocols are the rules and formats that govern the methods by which computers communicate over a network. Protocols link clients and servers together and handle requests and responses, including making a connection, making a request, and the closing of the connection. Transmission Control Protocol/Internet Protocol (TCP/IP) is the full set of standard protocols used on the Internet. Hypertext Transfer Protocol (HTTP) is an Internet protocol specifically for the World Wide Web. It provides a way for Web clients and servers to communicate primarily through the exchange of messages.
Multipurpose Internet Mail Extension (MIME) is a technique designed to insert attachments within individual e-mail files. MIME also allows a Web server to deliver multiple forms of data to the user in a single transfer. A Web page can therefore include text files as well as nontext files, such as sound, graphics, still images, and videos.
Intersection of Information Technology and Public Health. The applications of IT in public health are numerous and varied. One particularly important example, however, is the use of Geographical Information Systems (GIS). Using GIS, public health officials can build effective procedures around a feedback loop: measure, plan, act, and measure again. In this manner, officials can identify a problem (e.g., cancer) by measuring data from a registry. Working with the health care provider community, they can then select a target population (e.g., breast cancer) and develop an implementation strategy for an intervention plan. Finally, by measuring again, GIS allows public health officials to evaluate the impact of the implementation plan as reflected in that registry's data.
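A MIME message labels each part with a content type so that the receiving software knows how to handle it. The sketch below builds such a message with Python's standard `email` package; the addresses, file names, and attachment contents are placeholders invented for the example.

```python
from email.message import EmailMessage

message = EmailMessage()
message["Subject"] = "Weekly surveillance summary"
message["From"] = "sender@example.org"   # placeholder address
message["To"] = "recipient@example.org"  # placeholder address
message.set_content("The summary files are attached.")

# A nontext attachment: placeholder bytes labeled as a GIF image.
message.add_attachment(
    b"GIF89a", maintype="image", subtype="gif", filename="chart.gif"
)
# A text attachment labeled as comma-separated values.
message.add_attachment(
    "county,cases\nA,3\n", subtype="csv", filename="cases.csv"
)

# Each part of the message carries its own MIME content type.
content_types = [part.get_content_type() for part in message.walk()]
print(content_types)
```

The outer container is reported as a multipart type, followed by one entry per part, which is how a single transfer can carry text, images, and other data together.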
GIS is thus an information technology which can help improve health care and public health in
GIS can also help create disease-focused databases representing patients from a specific user-defined geographic area. In this fashion, the impact of a toxic release or exposure on a target population can be measured. GIS is a powerful tool for supplying immediate visualization of the likely geographic exposures; it allows an analyst to examine the variables that might affect the "fallout" of sprayings and to estimate its extent. Through the use of Computer Aided Design tools and GIS, medical centers as well as clinics are increasingly monitoring their patient care environments to help managers evaluate the risk of highly contagious diseases and implement control and isolation programs.
GIS helps health organizations visualize diagnostic and geographic information simultaneously and dynamically. Over 14,000 ICD-9 and ICD-10 codes describe medical diagnoses, treatments, and medical events worldwide. Public health clinics, hospitals, managed care organizations, and health insurers use this application to conduct data mining on very large clinical and administrative data warehouses.
In public health education, GIS can be an analytical tool of choice for health promotion staff when deciding where to target public health messages and warnings. GIS is also used to create interactive maps for health organizations required to publish information to the public. Health organizations require interactive maps depicting geographical areas and regions where infectious diseases and threats to the public's health are imminent.
LUIS G. KUN