Legal Information Retrieval

Retrieval is based on automatically searching documents for those that embody certain subject content. Such a system may involve automatic preprocessing of documents to form indexes or other structures that facilitate retrieval, but it does not involve human intervention such as manual indexing of documents. Many commercial retrieval systems provide access to databases that have not been manually indexed.

In the article “Concept and Context in Legal Information Retrieval”, authors K. Tamsin Maxwell and Burkhard Schafer, from the University of Edinburgh, observe that there are two broad approaches to information retrieval (IR) in the legal domain: those based on manual knowledge engineering (KE) and those based on natural language processing (NLP). The KE approach is grounded in artificial intelligence (AI) and case-based reasoning (CBR), whilst the NLP approach is associated with open domain statistical retrieval.

Related Main Concepts in Legal Information Retrieval

Case-based reasoning

Case-based reasoning (CBR), broadly construed, is the process of solving new problems based on the solutions of similar past problems. An auto mechanic who fixes an engine by recalling another car that exhibited similar symptoms is using case-based reasoning. A lawyer who advocates a particular outcome in a trial based on legal precedents or a judge who creates case law is using case-based reasoning.

Document retrieval

Document retrieval is defined as the matching of some stated user query against a set of free-text records. These records could be any type of mainly unstructured text, such as newspaper articles, real estate records or paragraphs in a manual. User queries can range from multi-sentence full descriptions of an information need to a few words. Document retrieval is sometimes referred to as, or as a branch of, Text Retrieval.
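
As a rough illustration of matching a user query against free-text records, the following minimal sketch ranks records by how often the query terms occur in them. The function names (tokenize, rank_records), the scoring rule and the sample records are assumptions made for this example only; real retrieval systems use far more refined weighting.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def rank_records(query, records):
    """Rank records by the number of query-term occurrences they contain.

    Returns (record_index, score) pairs, best first, and omits records
    that share no term with the query. This scoring rule is an
    illustrative assumption, not a standard retrieval model.
    """
    query_terms = set(tokenize(query))
    scored = []
    for index, record in enumerate(records):
        counts = Counter(tokenize(record))
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scored.append((index, score))
    # Sort by descending score; ties keep the original record order.
    return sorted(scored, key=lambda pair: -pair[1])


# Hypothetical free-text records.
records = [
    "The tenant must give notice before ending the lease.",
    "Engine maintenance schedule for the service manual.",
    "Notice of lease termination: the lease ends in May.",
]
ranking = rank_records("lease notice", records)
print(ranking)
```

The query here is only two words, but the same function accepts a multi-sentence description of an information need, since every query is reduced to a set of terms before matching.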

Knowledge engineering

Knowledge engineering (KE) was defined in 1983 by Edward Feigenbaum and Pamela McCorduck as follows: KE is an engineering discipline that involves integrating knowledge into computer systems in order to solve complex problems normally requiring a high level of human expertise. At present, it refers to the building, maintenance and development of knowledge-based systems.

Natural language processing

Natural language processing (NLP) is a field of computer science, artificial intelligence, and linguistics concerned with the interactions between computers and human (natural) languages. Specifically, it is the process of a computer extracting meaningful information from natural language input and/or producing natural language output. In theory, natural language processing is a very attractive method of human-computer interaction.

Information retrieval

Information retrieval (IR) is the area of study concerned with searching for documents, for information within documents, and for metadata about documents, as well as that of searching structured storage, relational databases, and the World Wide Web. There is overlap in the usage of the terms data retrieval, document retrieval, information retrieval, and text retrieval, but each also has its own body of literature, theory, praxis, and technologies.

Artificial intelligence

Artificial intelligence (AI) is the intelligence of machines and the branch of computer science that aims to create it. AI textbooks define the field as “the study and design of intelligent agents”, where an intelligent agent is a system that perceives its environment and takes actions that maximize its chances of success. John McCarthy, who coined the term in 1955, defines it as “the science and engineering of making intelligent machines”.

Conceptual model

In the most general sense, a model is anything used in any way to represent anything else. Some models are physical objects, for instance, a toy model which may be assembled, and may even be made to work like the object it represents. They are used to help us know and understand the subject matter they represent. The term conceptual model may be used to refer to models which are represented by concepts or related concepts which are formed after a conceptualization process in the mind.

Document

The term document has multiple meanings in ordinary language and in scholarship. WordNet 3.1 lists four meanings (October 2011):

  • document, written document, papers (writing that provides information)
  • document (anything serving as a representation of a person’s thinking by means of symbolic marks)
  • document (a written account of ownership or obligation)
  • text file, document (a computer file that contains text using seven-bit ASCII characters)

Theory

The English word theory was derived from a technical term in philosophy in Ancient Greek. The word theoria, θεωρία, meant “a looking at, viewing, beholding”, and referred to contemplation or speculation, as opposed to action. Theory is especially often contrasted with “practice” (from Greek praxis, πρᾶξις), a Greek term for “doing”, which is opposed to theory because theory involved no doing apart from itself.

Evaluation

Evaluation is a systematic determination of a subject’s merit, worth and significance, using criteria governed by a set of standards. It can assist an organization to ascertain the degree of achievement or value in regards to the aim and objectives of an undertaken project. The primary purpose of evaluation, in addition to gaining insight into prior or existing initiatives, is to enable reflection and assist in the identification of future change.

Cognition

In science, cognition refers to mental processes. These processes include attention, memory, producing and understanding language, solving problems, and making decisions. Cognition is studied in various disciplines such as psychology, philosophy, linguistics, science and computer science. Usage of the term varies across disciplines; for example, in psychology and cognitive science it usually refers to an information processing view of an individual’s psychological functions.

Jurisprudence

Jurisprudence is the study and theory of law. Scholars of jurisprudence, or legal theorists (including legal philosophers and social theorists of law), hope to obtain a deeper understanding of the nature of law, of legal reasoning, legal systems and of legal institutions. Modern jurisprudence began in the 18th century and was focused on the first principles of the natural law, civil law, and the law of nations.

List of terms in Legal Information Retrieval

  • abstract entity
  • ad hoc string syntax
  • ad hoc syntax
  • analysis base
  • automatic indexing
  • best match syntax
  • bibliographic coupling
  • bibliography
  • boolean syntax
  • bound term
  • catalog
  • cataloging
  • chain syntax
  • citation index
  • class
  • classification
  • classing
  • clustering
  • co-citation
  • co-extensive headings
  • complex term
  • compound term
  • concrete entity
  • concrete entity and event database
  • data
  • database
  • Databases (along with the systems for access that accompany those in electronic form) can be categorized in many ways: by mission or purpose (such as MIS: management information systems), by subject areas (such as GIS – geographical information systems), by models of organization (such as relational, hypertext, object-oriented, flat-file), or by phenomena represented by data (such as real, concrete entities (things, objects!) and events versus messages about entities and events, including abstract entities, imaginary entities and fictitious events)
  • datum, data
  • descriptive cataloging, descriptive indexing
  • descriptive indexing
  • descriptor
  • digital library
  • direct file
  • displayed index
  • document
  • documentary domain
  • documentary scope
  • documentary unit
  • domain
  • end-user thesaurus
  • entity
  • entry
  • entry array
  • Sometimes, there are separate entries that have been merged to save space and make the display of the index more convenient for the user
  • equivalent term
  • exact match syntax
  • exhaustivity
  • facet
  • Specialized Information Retrieval databases will make use of much more specialized facets
  • facet analysis
  • faceted syntax
  • flat-file database
  • format
  • free-text term
  • full-text database
  • generic posting
  • heading
  • hierarchical specificity
  • hierarchy
  • HTML (HyperText Markup Language)
  • human indexing
  • hypermedia
  • Hypermedia is really hypertext
  • hypertext
  • hypertext database
  • index, indexing
  • indexable matter
  • indexer thesaurus
  • information
  • IR database (information retrieval database)
  • Thus Information Retrieval databases have as their primary purpose the organization of data about messages, texts, and documents to facilitate their retrieval
  • Despite this general focus on the content and features of messages and texts, many Information Retrieval databases must also deal with concrete entities and events related to the creation and transmission of messages and texts
  • In contrast to concrete entity and event databases however, Information Retrieval databases are just as likely to focus on abstract, fictitious or imaginary entities, attributes and events, as compared to real concrete entities and events
  • IR system (information retrieval system or information storage and retrieval system)
  • inverse document frequency
  • inverted file
  • keyword indexing
  • knowledge
  • KWAC index
  • KWIC index
  • KWOC index
  • language model syntax
  • latent semantic indexing
  • literary warrant
  • locator
  • manual indexing
  • medium
  • message
  • metadata
  • natural language syntax
  • NEPHIS
  • non-displayed index
  • object-oriented database
  • ontology
  • operational specificity
  • optical coincidence Information Retrieval system
  • paradigmatic relationship
  • peek-a-boo Information Retrieval system
  • permuted syntax
  • postcoordinate, precoordinate syntax
  • postings
  • postings specificity
  • precision
  • precoordinate syntax
  • probabilistic model syntax
  • pseudo relevance feedback
  • recall
  • The denominator of this formula, the total number of relevant documents in an Information Retrieval database or collection, is impossible to determine
  • record
  • record format
  • reference database
  • relational database
  • relevance
  • relevance feedback
  • rotated term syntax
  • search interface
  • SGML (Standard Generalized Markup Language)
  • specificity
  • standards
  • statement/heading specificity
  • statistical specificity
  • stemming
  • stop list
  • string indexing (see string syntax)
  • string syntax
  • subject cataloging, subject indexing
  • subject domain
  • subject heading syntax
  • subject indexing
  • subject scope
  • Specialized Information Retrieval databases will have much more specific or narrower categories or facets
  • surrogate/surrogation
  • surrogate display
  • syndetic structure
  • syntactic cross-references
  • syntagmatic relationships
  • syntax
  • taxonomy
  • TEI (Text Encoding Initiative)
  • term
  • text
  • text encoding schema
  • textual database, textbase
  • thesaurus
  • unit of analysis
  • up-posting
  • user warrant
  • vector space model syntax
  • vocabulary control/vocabulary management
  • Syndetic structure (cross references) for equivalent, narrower, broader, and other related terms integrated into browsable alphanumeric displayed indexes
  • Indexing thesauri designed to guide the assignment of terms by indexers
  • End-user thesauri, designed for searchers rather than indexers
  • Co-occurrence term clustering
  • Ontologies
  • weighted term syntax
  • XML (eXtensible Markup Language)
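
The precision and recall entries in the list above are usually computed as simple ratios. The sketch below assumes the common definitions (precision as the share of retrieved documents that are relevant, recall as the share of relevant documents that are retrieved) and uses hypothetical document identifiers. As the note under recall points out, the full set of relevant documents is generally unknown for a real collection, so the `relevant` argument here stands in for a judged sample or test collection.

```python
def precision(retrieved, relevant):
    """Fraction of retrieved documents that are relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    if not retrieved:
        return 0.0  # convention chosen for this sketch when nothing is retrieved
    return len(retrieved & relevant) / len(retrieved)


def recall(retrieved, relevant):
    """Fraction of relevant documents that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    if not relevant:
        return 0.0  # convention chosen for this sketch when nothing is relevant
    return len(retrieved & relevant) / len(relevant)


# Hypothetical document identifiers.
retrieved_docs = {1, 2, 3, 4}
relevant_docs = {2, 4, 5, 6, 7}

p = precision(retrieved_docs, relevant_docs)  # 2 of 4 retrieved are relevant
r = recall(retrieved_docs, relevant_docs)     # 2 of 5 relevant were retrieved
print(p, r)
```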

Resources

Further Reading

  • Bashar Al-Shboul, Sung-Hyon Myaeng, Query phrase expansion using wikipedia in patent class search, Proceedings of the 7th Asia conference on Information Retrieval Technology, December 18-20, 2011, Dubai, United Arab Emirates.
  • Blair, D.C. Searching biases in large interactive document retrieval systems. J. Am. Soc. Inf. Sci. 31 (July 1980), 271-277.
  • Maxwell, K.T., and Schafer, B. (2008). “Concept and Context in Legal Information Retrieval”. Frontiers in Artificial Intelligence and Applications (IOS Press) 189: 63-72.
  • Jackson, P.; et al. (1998). “Information extraction from case law and retrieval of prior cases by partial parsing and query generation”. Conference on Information and Knowledge Management (ACM): 60-67.
  • Sparck Jones, K. Automatic Keyword Classification for Information Retrieval. Butterworths, London, 1971.
  • Blair, D.C., and Maron, M.E. (1985). “An evaluation of retrieval effectiveness for a full-text document-retrieval system”. Communications of the ACM (ACM) 28 (3): 289-299.
  • Swanson, D.R. Searching natural language text by computer. Science 132, 3434 (Oct. 1960), 1099-1104.
  • Swanson, D.R. Information retrieval as a trial and error process. Libr. Q. 47, 2 (1976), 128-148.
  • Swets, J.A. Information retrieval systems. Science 141 (1963), 245-250.
  • Peters, W.; et al. (2007). “The structuring of legal knowledge in LOIS”. Artificial Intelligence and Law (Springer Netherlands) 15 (2): 117-135.
  • Saravanan, M.; et al. (2009). “Improving legal information retrieval using an ontological framework”. Artificial Intelligence and Law (Springer Netherlands) 17 (2): 101-124.
  • Schweighofer, E. and Liebwald, D. (2007). “Advanced lexical ontologies and hybrid knowledge based systems: First steps to a dynamic legal electronic commentary”. Artificial Intelligence and Law (Springer Netherlands) 15 (2): 103-115.
  • Gelbart, D. and Smith, J.C. (1993). “FLEXICON: an evaluation of a statistical ranking model adapted to intelligent legal text management”. International Conference on Artificial Intelligence and Law (ACM): 142-151.
  • Ashley, K.D. and Bruninghaus, S. (2009). “Automatically classifying case texts and predicting outcomes”. Artificial Intelligence and Law (Springer Netherlands) 17 (2): 125-165.
  • Resnikoff, H.L. The national need for research in information science. STI Issues and Options Workshop. House subcommittee on science, research and technology, Washington, D.C. Nov. 3, 1976.
  • Salton, G. Automatic text analysis. Science 168, 3929 (Apr. 1970), 335-343.
  • Rosina Weber, Intelligent jurisprudence research: a new concept, Proceedings of the 7th international conference on Artificial intelligence and law, p.164-172, June 14-17, 1999, Oslo, Norway
  • Saracevic, T. Relevance: A review of and a framework for thinking on the notion in information science. J. Am. Soc. Inf. Sci. 26 (1975), 321-343.
  • Zunde, P. and Dexter, M.E. Indexing consistency and quality. Am. Doc. 20, 3 (July 1969), 259-264.
  • Bashar Al-Shboul, Sung-Hyon Myaeng, Wikipedia-based query phrase expansion in patent class search, Information Retrieval, v.17 n.5-6, p.430-451, October 2014

Optical Coincidence Information Retrieval System in Legal Information Retrieval

The following is a basic concept of Optical Coincidence Information Retrieval System in relation to information retrieval. In addition to this, Optical Coincidence Information Retrieval System may be applied to legal texts, including case law, legislation and scholarly works. See peek-a-boo Information Retrieval system.

Peek-A-Boo Information Retrieval System in Legal Information Retrieval

The following is a basic concept of Peek-A-Boo Information Retrieval System in relation to information retrieval. In addition, Peek-A-Boo Information Retrieval System may be applied to legal texts, including case law, legislation and scholarly works. Peek-a-boo is the nickname, and the more common name, for the optical coincidence Information Retrieval system, because light peeking through pin-holes indicates the presence of a hit (one or more documents matching the search criteria). This was one of the most prominent pre-computer systems for exact match syntax (boolean) searching. Cards approximately one foot square were used to represent index terms or descriptors for topics or features, including names of authors. After a document was indexed, the cards for each term assigned to the document were pulled from an alphabetical file, and the document was recorded on each card by drilling a small hole to represent the document number. On each card was a grid with 100 positions along the horizontal and vertical axes, so that 10,000 unique positions were available to represent 10,000 documents. Each document was given a two-part number, corresponding to the horizontal and vertical axes, so that document number 59-23 would get a hole drilled exactly 59 spaces to the right of the left margin and 23 spaces down from the top. A highly calibrated drill press was used to make these holes. See figure 5.1 for photos of a peek-a-boo optical coincidence Information Retrieval system, with equipment manufactured by the Jonker Corporation, circa 1967.
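
The card mechanism described above can be modelled in a few lines. The sketch below is an illustration under stated assumptions, not a reconstruction of the Jonker equipment: each term card is a set of drilled grid positions, and stacking cards corresponds to a set intersection, which is the boolean AND of the terms. The names TermCard, parse_document_number and coincide are hypothetical, the positions are assumed to be numbered 0 to 99 on each axis (the text does not say where numbering starts), and the sample documents and terms are invented.

```python
GRID_SIZE = 100  # 100 x 100 positions per card, as described above


def parse_document_number(document_number):
    """Split a two-part number such as "59-23" into (column, row)."""
    column_text, row_text = document_number.split("-")
    column, row = int(column_text), int(row_text)
    # Assumption: positions are numbered 0..99 on each axis.
    if not (0 <= column < GRID_SIZE and 0 <= row < GRID_SIZE):
        raise ValueError(f"position out of range: {document_number}")
    return column, row


class TermCard:
    """One card per index term; each hole records one document."""

    def __init__(self, term):
        self.term = term
        self.holes = set()

    def drill(self, document_number):
        """Record a document on this card by 'drilling' its position."""
        self.holes.add(parse_document_number(document_number))


def coincide(cards):
    """Stack the cards and return the documents where light passes through all."""
    if not cards:
        return []
    lit = set.intersection(*(card.holes for card in cards))
    return [f"{column:02d}-{row:02d}" for column, row in sorted(lit)]


# Hypothetical indexing decisions: document number -> assigned terms.
assignments = {
    "59-23": ["contract", "damages"],
    "12-40": ["contract"],
    "07-88": ["contract", "damages", "negligence"],
}

cards = {}
for document_number, terms in assignments.items():
    for term in terms:
        cards.setdefault(term, TermCard(term)).drill(document_number)

hits = coincide([cards["contract"], cards["damages"]])
print(hits)
```

Representing each card as a set of coordinates keeps the analogy close to the physical system: adding a term to a search only ever removes lit positions, just as adding a card to the stack can only block more light.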

