DISCOVERING NEW INFORMATION IN BIBLIOGRAPHIC DATABASES

Emil Hudomalj

Abstract

Databases also contain information that today's query systems usually cannot uncover. For this purpose we can use methods for knowledge discovery in databases, which support browsing aggregated data, identifying trends, performing online analyses, finding unknown relationships among data, and similar tasks. These methods are used successfully to discover new information in many fields, such as banking, insurance, and telecommunications, but they are applied less often in librarianship. The article summarizes the development of systems for querying bibliographic databases, including some early attempts to apply newer approaches to knowledge discovery in databases. This is followed by a description of analytical databases, which most often serve as the basis for discovering new patterns, and a description of data mining, which is an important step in this process. The article emphasizes the role of librarians, who can take on a key role in building systems for discovering new information in bibliographic databases.

Keywords

bibliographic databases; analytical databases; data warehouses; data mining
