1:- module( pub_graph, 2 [ 3 pub_graph_abstracts/2, 4 pub_graph_cited_by/2, pub_graph_cited_by/3, 5 pub_graph_cites/2, pub_graph_cites/3, 6 pub_graph_cache_open/5, 7 pub_graph_cache_save/4, 8 pub_graph_cited_by_graph/3, 9 pub_graph_cited_by_treadmill/3, 10 pub_graph_summary_info/3, 11 pub_graph_search/2, pub_graph_search/3, 12 pub_graph_summary_display/1, pub_graph_summary_display/2, 13 pub_graph_summary_display/3, 14 pub_graph_summary_display_info/2, 15 pub_graph_table/3, 16 pub_graph_id/2, 17 pub_graph_version/2 18 ] 19 ). 20 21:- use_module(library(sgml)). 22:- use_module(library(apply)). 23:- use_module(library(debug)). 24:- use_module(library(lists)). 25:- use_module(library(ordsets)). % ord_add_element/3. 26:- use_module(library(http/http_open)). % http_open/3 27:- use_module(library(http/json)). % json_read/2,3 28:- use_module(library(http/html_write) ). 29 30% stoics.org.uk libs: 31:- use_module( library(options) ). 32 33:- dynamic pub_graph_cache:cited_by/3. 34:- dynamic pub_graph_cache:info/4. 35:- dynamic pub_graph_cache:info_date/2.
85% This does (no longer ?) work: [try abs_fname/n] 86:- ( catch(use_module(library(prosqlite)),_,fail) -> true 87 ; 88 debug(pub_graph, 'proSQLite not available. Caching through prosqlite disabled', [] ) 89 ). 90 91:- ( catch(use_module(library(odbc)),_,fail) -> true 92 ; 93 debug(pub_graph, 'proSQLite not available. Caching through odbc disabled', [] ) 94 ). 95:- ( catch(use_module(library(db_facts)),_,fail) -> true 96 ; 97 debug(pub_graph, 'pack(db_facts) not available. Caching is disabled', [] ) 98 ). 99 100% Section: defaults, shortcuts. 101 102file_type_extension(csv, csv). 103file_type_extension(prolog, pl). 104file_type_extension(sqlite, sqlite). 105 106url_eutils('https://eutils.ncbi.nlm.nih.gov/entrez/eutils/'). 107url_efetch_pubmed('efetch.fcgi?db=pubmed'). 108 109url_semscholar('http://api.semanticscholar.org/v1/paper/'). 110 111default_names( Names ) :- 112 default_names( ncbi, Names ). 113default_names( semscholar, Names ) :- 114 Names = [arxivId,authors,doi,title,topics,venue,year]. 115 % citationVelocity, citations, influentialCitationCount, paperId, references, url 116default_names( ncbi, Names ) :- 117 Names = ['Author','Title','Source','Pages','PubDate', 118 'Volume','Issue','ISSN','PmcRefCount', 119 'PubType','FullJournalName']. 120 121pub_graph_graph(true). % needed for options_append/3 122pub_graph_graph_defaults( [depth(0),verbose(false),update(true),date(AgesAgo),flat(false)] ) :- 123 a_month_ago( AgesAgo ).
ncbi
(https://www.ncbi.nlm.nih.gov/pubmed/) and semscholar
(http://semanticscholar.org/) are the known IdTypes.
The predicate does not connect to the server, it only type checks the shape of Id.
If Id is an integer or an atom that can be turned to an integer, then IdType is instantiated to ncbi
.
There are three term forms for semscholar
.
The following two ids correspond to the same paper.
?- pub_graph_id( 12075665, Type ). Type = ncbi. ?- pub_graph_id( cbd251a03b1a29a94f7348f4f5c2f830ab80a909, Type ). Type = semscholar.
161pub_graph_id( Id, IdType ) :- 162 ( integer(Id) ; (catch(atom_number(Id,Numb),_,fail),integer(Numb)) ), 163 !, 164 IdType = ncbi. 165pub_graph_id( Term, IdType ) :- 166 ( 167 catch( hex_bytes(Term,_Bytes), _, fail ) 168 ; 169 Term = doi:_ 170 ; 171 Term = arXiv 172 ), 173 !, 174 IdType = semscholar. 175 176% Section: interface predicates
?- pub_graph_version(V,D). V = 1:2:0, D = date(2023, 9, 20).
*/
191pub_graph_version( 1:2:0, date(2023,9,20) ). 192% pub_graph_version( 1:1:0, date(2018,9,23) ). 193% pub_graph_version( 1:0:0, date(2018,9,22) ). 194% pub_graph_version( 0:0:3, date(2012,08,15) ). 195 196pub_graph_search_defaults( Defs ) :- 197 Defs = [ 198 gap(0), 199 retmax(100), 200 quote_value(true), 201 tmp_keep(false), 202 verbose(false) 203 ].
Search in pub_graph for terms in the search term STerm
.
In this, conjunction is marked by , (comma) and
disjunction by ; (semi-column). '-' pair terms are considered as
Key-Value and interpreted as Value[Key] in the query.
List are thought to be flat conjoint search terms with no pair values in them which are
interpreted by pub_graph also as OR operations.
(See example below.)
Known keys are : journal
, pdat
. au
, All Fields
The predicate constructs a query that is posted via the http API provided
by NCBI (http://www.ncbi.nlm.nih.gov/books/NBK25500/).
Options should be a term or list of terms from:
ncbi
terms: Title, Title/Abstract and Affiliation.
The higher the number the looser the match. The default allows for no intervening words, so only
exact sub-matches will be returned (see example: fixme below)
see: https://pubmed.ncbi.nlm.nih.gov/help/#proximity-searchingQTrans
the actual query ran on the
the pub_graph server.Tmp
is variable the file that was used
to receive the results from pub_graph.Keep==true
Verbose == true
then the predicate is verbose about its progress by,
for instance, requesting query is printed on current output stream.?- St = (journal=science,[breast,cancer],pdat=2008), pub_graph_search( St, Ids, [verbose(true),qtranslation(QTrans)] ), length( Ids, Len ), write( number_of:Len ), nl, pub_graph_summary_display( Ids, _, display(all) ). https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&retmax=100&term=science[journal]+AND+breast+cancer+AND+2008[pdat] tmp_file(/tmp/swipl_3884_9) number_of:6 ---- 1:19008416 Author=[Varambally S,Cao Q,Mani RS,Shankar S,Wang X,Ateeq B,Laxman B,Cao X,Jing X,Ramnarayanan K,Brenner JC,Yu J,Kim JH,Han B,Tan P,Kumar-Sinha C,Lonigro RJ,Palanisamy N,Maher CA,Chinnaiyan AM] Title=Genomic loss of microRNA-101 leads to overexpression of histone methyltransferase EZH2 in cancer. Source=Science Pages=1695-9 PubDate=2008 Dec 12 Volume=322 Issue=5908 ISSN=0036-8075 PmcRefCount=352 PubType=Journal Article FullJournalName=Science (New York, N.Y.) ---- 2:18927361 Author=Couzin J Title=Genetics. DNA test for breast cancer risk draws criticism. Source=Science ... ... ... 6:18239125 Author=[Silva JM,Marran K,Parker JS,Silva J,Golding M,Schlabach MR,Elledge SJ,Hannon GJ,Chang K] Title=Profiling essential genes in human mammary cells by multiplex RNAi screening. Source=Science Pages=617-20 PubDate=2008 Feb 1 Volume=319 Issue=5863 ISSN=0036-8075 PmcRefCount=132 PubType=Journal Article FullJournalName=Science (New York, N.Y.) ---- St = (journal=science, [breast, cancer], pdat=2008), Ids = ['19008416', '18927361', '18787170', '18487186', '18239126', '18239125'], QTrans = ['("Science"[Journal] OR "Science (80- )"[Journal] OR "J Zhejiang Univ Sci"[Journal]) AND ("breast neoplasms"[MeSH Terms] OR ("breast"[All Fields] AND "neoplasms"[All Fields]) OR "breast neoplasms"[All Fields] OR ("breast"[All Fields] AND "cancer"[All Fields]) OR "breast cancer"[All Fields]) AND 2008[pdat]'], Len = 6. ?- date(Date), St = (author='Borst Piet'), pub_graph_search( St, Ids, verbose(true) ), length( Ids, Len ), write( number_of:Len ), nl. https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&retmax=100&term=Borst%20Piet\[author\] tmp_file(/tmp/swipl_18703_0) number_of:83 Date = date(2018, 9, 22), St = (author='Borst Piet'), Ids = ['29894693', '29256493', '28821557', '27021571', '26774285', '26530471', '26515061', '25799992', '25662217'|...], Len = 83. ?- date(Date), pub_graph_search( prolog, Ids ), length( Ids, Len ), write( number_of:Len ), nl. number_of:100 Date = date(2018, 9, 22), Ids = ['30089663', '28647861', '28486579', '27684214', '27142769', '25509153', '24995073', '22586414', '22462194'|...], Len = 100. ?- date(Date), pub_graph_search( prolog, Ids, retmax(200) ), length( Ids, Len ), write( number_of:Len ), nl. number_of:127 Date = date(2018, 9, 22), Ids = ['30089663', '28647861', '28486579', '27684214', '27142769', '25509153', '24995073', '22586414', '22462194'|...], Len = 127. ?- St = ('breast','cancer','Publication Type'='Review'), date(Date), pub_graph_search( St, Ids, reldate(30) ), length( Ids, Len ). Date = date(2018, 9, 22), Ids = ['30240898', '30240537', '30240152', '30238542', '30238005', '30237735', '30236642', '30236594', '30234119'|...], Len = 100. ?- pub_graph_summary_display( 30243159, _, true ). ---- 1:30243159 Author=[Wang K,Yee C,Tam S,Drost L,Chan S,Zaki P,Rico V,Ariello K,Dasios M,Lam H,DeAngelis C,Chow E] Title=Prevalence of pain in patients with breast cancer post-treatment: A systematic review. ---- true.
Version 0:3 (pub_graph_version(1:2:0,_D)
).
?- date(Date), pub_graph_search(title='Bayesian networks elucidate', Ids, true), length(Ids,Len). Date = date(2023, 9, 20), Ids = ['35379892'], Len = 1. ?- date(Date), pub_graph_search(title='Bayesian elucidate', Ids, true), length(Ids,Len). Date = date(2023, 9, 20), Ids = [], Len = 0. ?- date(Date), pub_graph_search(title='Bayesian elucidate', Ids, gap(1)), length(Ids, Len), pub_graph_summary_display(Ids, _, true). ---- 1:35379892 Author=[Angelopoulos N,Chatzipli A,Nangalia J,Maura F,Campbell PJ] Title=Bayesian networks elucidate complex genomic landscapes in cancer. ---- Date = date(2023, 9, 20), Ids = ['35379892'], Len = 1. ?- date(D), write('Appears in abstract: "explainable Artificial Intelligence models"'), nl, pub_graph_search('Title/Abstract'='explainable Artificial Intelligence models', Ids, true), pub_graph_summary_display(Ids). 1 ... 10:32417928 Author=[Payrovnaziri SN,Chen Z,Rengifo-Moreno P,Miller T,Bian J,Chen JH,Liu X,He Z] Title=Explainable artificial intelligence models using real-world electronic health record data: a systematic scoping review. ?- date(D), pub_graph_search('Title/Abstract'='explainable Intelligence models', Ids, true). D = date(2023, 9, 20), Ids = []. ?- date(D), pub_graph_search((tiab='explainable Intelligence models',affiliation=sanger), Ids, gap(1)). D = date(2023, 9, 20), Ids = ['35379892'].
Also 0:3 added quote_value(Qv)
. Compare:
?- date(Date), pub_graph_search(title='Bayesian networks elucidate', Ids, true), length(Ids,Len). Date = date(2023, 9, 20), Ids = ['35379892'], Len = 1. ?- date(Date), pub_graph_search(title='Bayesian networks elucidate', Ids, quote_value(false)), length(Ids,Len). Date = date(2023, 9, 20), Ids = ['35923659', '35379892', '32609725', '29055062', '27303742', '26362267'], Len = 6.
427pub_graph_search( STerm, Ids ) :- 428 pub_graph_search( STerm, Ids, [] ). 429 430pub_graph_search( Sterm, Ids, Args ) :- 431 options_append( pub_graph_search, Args, Opts ), 432 url_eutils( Eutils ), 433 ( ground(Sterm) -> true; type_error(ground,Sterm) ), 434 options( gap(Gap), Opts ), 435 options( quote_value(Qv), Opts ), 436 search_term_to_query( Sterm, Gap, Qv, Query ), 437 memberchk( retmax(Ret), Opts ), 438 pub_graph_search_period_opts( '', Period, Opts ), 439 atomic_list_concat( [Eutils,'esearch.fcgi?db=pubmed&retmax=',Ret,Period,'&term=',Query], Url ), 440 memberchk_optional( tmp_file(Tmp), Opts ), 441 memberchk( verbose(Verb), Opts ), 442 true_writes( Verb, Url ), 443 debug( pub_graph, 'Get url is: ~w', Url ), 444 get_url_in_file( Url, Verb, Tmp ), 445 true_writes( Verb, tmp_file(Tmp) ), 446 load_xml_file( Tmp, Xml ), 447 ( (memberchk(qtranslation(QTrans),Opts), 448 QT = 'QueryTranslation', 449 search_element_in_list(Xml,QT,[],element(_,_,QTrans))) -> true; true 450 ), 451 all_subs_in_xml_single( Xml, 'IdList', 'Id', NastyIds ), 452 flatten( NastyIds, Ids ), 453 memberchk_optional( tmp_keep(Keep), Opts ), 454 true_atom_keeps_file( Keep, Tmp ).
pub_graph_summary_display( Ids, _Summary, [] ).
*/
461pub_graph_summary_display( Ids ) :-
462 pub_graph_summary_display( Ids, _Summary, [] ).
pub_graph_summary_display( Ids, Summary, [] ).
*/468pub_graph_summary_display( Ids, Summary ) :- 469 pub_graph_summary_display( Ids, Summary, [] ). 470 471% pub_graph_summary_display_defaults( [display(['Title','Author']),names(Names)] ) :- 472% default_names( Names ). 473pub_graph_summary_display_defaults( [display([title,'Title',authors,'Author'])] ).
Opts
Ids
.
Disp values of var(Disp)
, '*' and 'all', list all available values.?- date(Date), pub_graph_search((programming,'Prolog'), Ids), length( Ids, Len), Ids = [A,B,C|_], pub_graph_summary_display( [A,B,C] ). ---- 1:28486579 Author=[Holmes IH,Mungall CJ] Title=BioMake: a GNU make-compatible utility for declarative workflow management. ---- 2:24995073 Author=[Melioli G,Spenser C,Reggiardo G,Passalacqua G,Compalati E,Rogkakou A,Riccio AM,Di Leo E,Nettis E,Canonica GW] Title=Allergenius, an expert system for the interpretation of allergen microarray results. ---- 3:22215819 Author=[Mørk S,Holmes I] Title=Evaluating bacterial gene-finding HMM structures as probabilistic logic programs. ---- Date = date(2018, 9, 22), Ids = ['28486579', '24995073', '22215819', '21980276', '15360781', '11809317', '9783213', '9293715', '9390313'|...], Len = 43. A = '28486579', B = '24995073', C = '22215819'.
?- pub_graph_summary_display( 30235570, _, display(*) ). ---- 1:30235570 Author=[Morgan CC,Huyck S,Jenkins M,Chen L,Bedding A,Coffey CS,Gaydos B,Wathen JK] Title=Adaptive Design: Results of 2012 Survey on Perception and Use. Source=Ther Innov Regul Sci Pages=473-481 PubDate=2014 Jul Volume=48 Issue=4 ISSN=2168-4790 PmcRefCount=0 PubType=Journal Article FullJournalName=Therapeutic innovation & regulatory science ----
?- pub_graph_cited_by( 20195494, These ), pub_graph_summary_display( These, _, [display(['Title','Author','PubDate'])] ). ---- 1:29975690 Author=[Tang K,Boudreau CG,Brown CM,Khadra A] Title=Paxillin phosphorylation at serine 273 and its effects on Rac, Rho and adhesion dynamics. PubDate=2018 Jul ---- 2:29694862 Author=[McKenzie M,Ha SM,Rammohan A,Radhakrishnan R,Ramakrishnan N] Title=Multivalent Binding of a Ligand-Coated Particle: Role of Shape, Size, and Ligand Heterogeneity. PubDate=2018 Apr 24 ---- 3:29669897 Author=[Padmanabhan P,Goodhill GJ] Title=Axon growth regulation by a bistable molecular switch. PubDate=2018 Apr 25 ... ... 26:20473365 Author=[Welf ES,Haugh JM] Title=Stochastic Dynamics of Membrane Protrusion Mediated by the DOCK180/Rac Pathway in Migrating Cells. PubDate=2010 Mar 1 ---- These = [29975690, 29694862, 29669897, 28752950, 27939309, 27588610, 27276271, 25969948, 25904526|...]. ?- pub_graph_summary_display( 20195494, _Res, true ). ---- 1:20195494 Author=[Cirit M,Krajcovic M,Choi CK,Welf ES,Horwitz AF,Haugh JM] Title=Stochastic model of integrin-mediated signaling and adhesion dynamics at the leading edges of migrating cells. ---- true. ?- pub_graph_summary_display( cbd251a03b1a29a94f7348f4f5c2f830ab80a909, _, display(all) ). ---- 1:cbd251a03b1a29a94f7348f4f5c2f830ab80a909 arxivId=[] authors=[Graham J. L. Kemp,Nicos Angelopoulos,Peter M. D. Gray] doi=10.1109/TITB.2002.1006298 title=Architecture of a mediator for a bioinformatics database federation topics=[] venue=IEEE Transactions on Information Technology in Biomedicine year=2002 ---- true.
*/
593pub_graph_summary_display( IdS, Summary, Args ) :-
594 % pub_graph_summary_display_defaults( Defs ),
595 % options_append( Opts, Defs, All ),
596 non_var_list( IdS, Ids ),
597 options_append( pub_graph_summary_display, Args, Opts ),
598 options( display(Disp), Opts ),
599 pub_graph_summary_info( Ids, Summary, Opts ),
600 pub_graph_summary_display_info( Summary, Disp ).
607pub_graph_summary_display_info( SummaryIn, Disp ) :- 608 pg_en_list( SummaryIn, Summary ), 609 write( '----' ), nl, 610 nth1( N, Summary, Id-Rec ), 611 write( N:Id ), nl, 612 findall( _, (member(D-Val,Rec),once((var(Disp);Disp=all;Disp='*';member(D,Disp))),put(0'\t), write(D=Val),nl), _ ), 613 write( '----' ), nl, fail. 614pub_graph_summary_display_info( _Summary, _Disp ). 615 616% 617% Redirects to pub_graph_cited_by( Id, Ids, [] ). 618% 619pub_graph_cited_by( Id, Ids) :- 620 pub_graph_cited_by( Id, Ids, [] ).
Options is a term option or list of terms from the following;
true
update the cache if you do an explicit retrieval.?- date(D), pub_graph_cited_by( 12075665, By ), length( By, Len ). D = date(2018, 9, 22), By = [25825659, 19497389, 19458771], Len = 3. ?- date(D), pub_graph_cited_by( cbd251a03b1a29a94f7348f4f5c2f830ab80a909, By ), length( By, Len ). D = date(2018, 9, 22), By = ['2e1f686c2357cead711c8db034ff9aa2b7509621', '6f125881788967e1eec87e78b3d2db61d1a8d0ac'|...], Len = 12.
*/
654pub_graph_cited_by( Id, Ids, Args ) :- 655 options_append( pub_graph_graph, Args, Opts ), 656 pub_graph_cited_by_1( Id, Ids, Opts ). 657 658pub_graph_cited_by_1( Id, Ids, Opts ) :- 659 ( ( memberchk(cache(Type,Handle,Date,_Upd), Opts), 660 pub_graph_date_cached(Type,Handle,cited_by,Date,Id,Ids) ) -> 661 ( memberchk(verbose(true),Opts) -> 662 write( got_from_cache(Id,Ids) ), nl 663 ; 664 true 665 ) 666 ; 667 pub_graph_cited_by_uncached( Id, Ids, Opts ), 668 ( (memberchk(cache(Type,Handle,_Date,Upd),Opts),Upd==true) -> 669 pub_graph_update_cache( Type, Handle, cited_by, Id, Ids ) 670 ; 671 true 672 ) 673 ). 674 675pub_graph_cited_by_uncached( Id, Ids, Opts ) :- 676 pub_graph_id( Id, IdType ), 677 pub_graph_cited_by_uncached( IdType, Id, Ids, Opts ). 678 679pub_graph_cited_by_uncached( semscholar, Id, Ids, _Verb ) :- 680 semscholar_id_json( Id, Json ), 681 memberchk( citations=Citations, Json ), 682 findall( ACit, ( member(json(Sub),Citations), 683 member(paperId=ACit,Sub), 684 ACit \== '' 685 ), 686 Ids ). 687pub_graph_cited_by_uncached( ncbi, Id, Ids, Opts ) :- 688 ( memberchk(verbose(Verb),Opts) -> true; Verb = false ), 689 url_eutils( Eutils ), 690 Query = 'elink.fcgi?report=xml&mode=text&tool=curl&db=pubmed&cmd=neighbor&linkname=pubmed_pubmed_citedin&id=', 691 atomic_list_concat( [Eutils,Query,Id], Url ), 692 get_url_in_file( Url, Verb, Tmp ), 693 ( pub_graph_cited_by_parse_file(Tmp, Ids) -> 694 true 695 ; 696 Ids = [] 697 ), 698 delete_file( Tmp ).
id_has_journal( Id, Jounal )
succeeds (det.)706pub_graph_id_journal( IdPrv, _Provisional, Journal ) :- 707 current_predicate( user:id_has_journal/2 ), 708 to_number( IdPrv, Id ), 709 user:id_has_journal( Id, Journal ), 710 !. 711pub_graph_id_journal( _Id, Journal, Journal ). 712 713% 714% Redirects to pub_graph_cites( Id, Ids, [] ). 715% 716pub_graph_cites( Id, Ids ) :- 717 pub_graph_cites( Id, Ids, [] ).
Options is a term option or list of terms from the following;
?- date(D), pub_graph_cites( 20195494, Ids ), length( Ids, Len ), write( D:Len ), nl. date(2018,9,22):38 D = date(2018, 9, 22), Ids = ['19160484', '19118212', '18955554', '18800171', '18586481'|...], Len = 38. % pubmed does not have references cited by the following paper: ?- date(D), pub_graph_cites( 12075665, Ids ), length( Ids, Len ), write( D:Len ), nl. false. % whereas, semanticscholar.org finds 17 (non '') of the 21: ?- date(D), pub_graph_cites( cbd251a03b1a29a94f7348f4f5c2f830ab80a909, Ids ), length( Ids, Len ), write( D:Len ), nl. date(2018,9,22):17 D = date(2018, 9, 22), Ids = ['6477792829dd059c7d318927858d307347c54c2e', '1448901572d1afd0019c86c42288108a94f1fb25', |...], Len = 17. ?- pub_graph_summary_display( 12075665, Results, true ). ---- 1:12075665 Author=[Kemp GJ,Angelopoulos N,Gray PM] Title=Architecture of a mediator for a bioinformatics database federation. ---- Results = [12075665-['Author'-['Kemp GJ', 'Angelopoulos N', 'Gray PM'], ... - ...|...]].
775pub_graph_cites( Id, Ids, OptsIn ) :- 776 non_var_list( OptsIn, Opts ), 777 pub_graph_id( Id, IdType ), 778 ( memberchk(verbose(Verb),Opts) -> true; Verb = false ), 779 pub_graph_id_type_cites( IdType, Id, Ids, Verb ). 780 781pub_graph_id_type_cites( ncbi, Id, Ids, Verb ) :- 782 url_eutils( Eutils ), 783 % Query = 'elink.fcgi?report=xml&mode=text&tool=curl&db=pmc&DbFrom=pubmed&Cmd=link&linkname=pubmed_pmc_refs&id=', 784 Query = 'elink.fcgi?report=xml&mode=text&tool=curl&db=pubmed&Cmd=neighbor&linkname=pubmed_pubmed_refs&id=', 785 atomic_list_concat( [Eutils,Query,Id], Url ), 786 get_url_in_file( Url, Verb, Tmp ), 787 load_xml_file( Tmp, Xml ), 788 once( search_element_in_list( Xml, 'LinkSetDb', [], element(_,_,LXml) ) ), 789 findall( CId, search_element_in_list(LXml,'Id',[],element(_,_,[CId])),Ids ), 790 delete_file( Tmp ). 791pub_graph_id_type_cites( semscholar, Id, Ids, _Verb ) :- 792 semscholar_id_json( Id, Json ), 793 memberchk( references=Citations, Json ), 794 findall( ACit, ( member(json(Sub),Citations), 795 member(paperId=ACit,Sub), 796 ACit \== '' 797 ), 798 Ids ). 799 800pub_graph_table_defaults( Defs ) :- 801 NoSearch = 'No search term available', 802 Defs = [ include_if(false), output(csv), stem([]), spy([]), search(NoSearch) ].
Can include journal impact factor if jif/6 is provided.
Output rows contain #citing, [IF ,] Date, Journal, Title, Author, (Title urled to pubmed/$id)
Opts
has(Val)
,quite(Val)
]?- pub_graph_table
840pub_graph_table( Ids, Rows, Args ) :- 841 options_append( pub_graph_table, Args, Opts ), 842 OTerms = [include_if(Iif),missing_if(Mif),output(Form),stem(Stem),spy(SpY)], 843 options( OTerms, Opts ), 844 pg_en_list( SpY, Spy ), 845 % ground( Iif ), % i think options checks. 846 maplist( pub_graph_table_id_row(Iif,Mif,Spy), Ids, Rows ), 847 options( search(Search), Opts ), 848 pub_graph_table_output( Stem, Form, Iif, Search, Rows ). 849 850pub_graph_table_id_row( Iif, Mif, Spy, Id, Row ) :- 851 ( memberchk(Id,Spy) -> trace; true ), 852 pub_graph_table_id_main_args( Id, Journal, Args ), 853 pub_graph_table_iif( Iif, Mif, Args, Journal, Irgs ), 854 Row =.. [row|Irgs]. 855 856pub_graph_table_id_main_args( Id, Journal, Args ) :- 857 pub_graph_table_id_main_args_all( Id, Journal, Args ), 858 !. 859pub_graph_table_id_main_args( Id, Journal, Args ) :- 860 throw( could_not_find_all_info_for_table_row_of(Id,Journal,Args) ). 861 862pub_graph_table_id_main_args_all( Id, Journal, Args ) :- 863 pub_graph_summary_info( Id, Summ, [] ), 864 memberchk( 'PmcRefCount'-PRCount, Summ ), 865 memberchk( 'FullJournalName'-JournalPrv, Summ ), 866 memberchk( 'Title'-Title, Summ ), 867 memberchk( 'Author'-AuthorS, Summ ), 868 pg_en_list( AuthorS, Authors ), 869 atomic_list_concat( Authors, ', ', Author ), 870 memberchk( 'PubDate'-Pdate, Summ ), 871 pub_graph_id_journal( Id, JournalPrv, JournalPrv2 ), 872 ( pub_graph_journal_synonym(JournalPrv2,Journal) -> true; Journal = JournalPrv2 ), 873 Args = [PRCount,Pdate,Journal,Title,Author,Id]. 874 875pub_graph_table_iif( false, _Mif, Args, _J, Args ). 876pub_graph_table_iif( true, Mif, [A|Rgs], J, [A,Jif|Rgs] ) :- 877 downcase_atom( J, DownJ ), 878 pub_graph_table_jif( DownJ, J, Mif, Jif ). 879 880pub_graph_table_jif( DownJ, _J, _Mif, Jif ) :- 881 user:jif( _, DownJ, _, JifGot, _, _ ), 882 ( number(JifGot) -> JifGot = Jif; Jif is 0 ), 883 !. 884pub_graph_table_jif( DownJ, _J, _Mif, Jif ) :- 885 pub_graph_jif_alternative( DownJ, AlteJ ), 886 user:jif( _, AlteJ, _, JifGot, _, _ ), 887 ( number(JifGot) -> JifGot = Jif; Jif is 0 ), 888 !. 889pub_graph_table_jif( DownJ, _J, _Mif, Jif ) :- 890 missing_jif( DownJ, Jif ), 891 !. 892pub_graph_table_jif( _DownJ, J, Mif, Jif ) :- 893 pub_graph_table_jif_missing( Mif, J, Jif ), 894 !. % i dont think the above leaves b.points 895 896pub_graph_table_jif_missing( has(Val), J, Val ) :- 897 debug( pub_graph, 'No IF for: ~w', J ). 898pub_graph_table_jif_missing( quiet(Val), _J, Val ). 899pub_graph_table_jif_missing( throw, J, _Jif ) :- 900 throw( no_jif_for_journal(J) ). 901 902pub_graph_jif_alternative( Jname, Alternative ) :- 903 member( Mfx, [', ',' : ',' and ','. '] ), 904 atom_length( Mfx, Len ), 905 sub_atom( Jname, Bef, Len, Aft, Mfx ), 906 sub_atom( Jname, 0, Bef, _, Pfx ), 907 ToAft is Bef + Len, 908 sub_atom( Jname, ToAft, Aft, 0, Psf ), 909 member( Nfx, ['-',' ',' & '] ), 910 atomic_list_concat( [Pfx,Nfx,Psf], Alternative ). 911pub_graph_jif_alternative( Jname, Alternative ) :- 912 member( Bp, [' : ','; '] ), 913 atom_length( Bp, Len ), 914 sub_atom( Jname, Bef, Len, _Aft, ' : ' ), 915 sub_atom( Jname, 0, Bef, _, Alternative ). 916pub_graph_jif_alternative( Jname, Alternative ) :- 917 atom_concat( 'the ', Alternative, Jname ). 918 919missing_jif( metabolites, 0 ). 920missing_jif( 'genomics, proteomics & bioinformatics', 0 ). 921missing_jif( 'frontiers in pharmacology', 0 ). % http://www.frontiersin.org/news/Frontiers_Impact_Factors_2013/875 922missing_jif( 'advances in experimental medicine and biology', 2.012 ). % http://www.springer.com/series/5584 923missing_jif( 'expert review of clinical pharmacology', -2 ). % don't think this has IF in general 924missing_jif( 'dental research journal', -2 ). % don't think this has IF in general 925missing_jif( 'brain and nerve = shinkei kenkyu no shinpo', -2 ). % only in Japanese 926missing_jif( 'critical reviews in biomedical engineering', -2 ). % discontinued i think 927missing_jif( 'yao xue xue bao = acta pharmaceutica sinica', -2 ). % in Chinese 928missing_jif( 'gan to kagaku ryoho. cancer & chemotherapy', -2 ). % in Japanese 929missing_jif( 'chinese journal of cancer', 0 ). % http://www.researchgate.net/journal/1944-446X_Chinese_journal_of_cancer 930missing_jif( 'seminars in oncology nursing', -2 ). % http://www.researchgate.net/journal/1878-3449_Seminars_in_Oncology_Nursing 931missing_jif( 'critical reviews in oncogenesis', -2 ). % http://www.researchgate.net/journal/0893-9675_Critical_reviews_in_oncogenesis 932missing_jif( 'praxis', -2 ). % in German 933 934 935pub_graph_journal_synonym( 'Cellular and molecular life sciences : CMLS', 'CELLULAR AND MOLECULAR LIFE SCIENCES' ). 936pub_graph_journal_synonym( 'Tumour biology : the journal of the International Society for Oncodevelopmental Biology and Medicine', 'TUMOR BIOLOGY' ). % http://www.ncbi.nlm.nih.gov/nlmcatalog/8409922 937pub_graph_journal_synonym( 'American journal of physiology. Heart and circulatory physiology', 'AMERICAN JOURNAL OF PHYSIOLOGY-HEART AND CIRCULATORY PHYSIOLOGY' ). % http://www.ncbi.nlm.nih.gov/nlmcatalog/8409922 938pub_graph_journal_synonym( 'The journals of gerontology. Series A, Biological sciences and medical sciences', 'JOURNALS OF GERONTOLOGY SERIES A-BIOLOGICAL SCIENCES AND MEDICAL SCIENCES' ). 939pub_graph_journal_synonym( 'Proteomics. Clinical applications', 'Proteomics Clinical applications' ). 940pub_graph_journal_synonym( 'Future oncology (London, England)', 'Future Oncology' ). 941pub_graph_journal_synonym( 'The oncologist', 'Oncologist' ). 942pub_graph_journal_synonym( 'Cancer control : journal of the Moffitt Cancer Center', 'Cancer Control' ). 943pub_graph_journal_synonym( 'Breast (Edinburgh, Scotland)', 'Breast' ). 944pub_graph_journal_synonym( 'Cancer journal (Sudbury, Mass.)', 'Cancer Journal' ). 945pub_graph_journal_synonym( 'Hematology / the Education Program of the American Society of Hematology. American Society of Hematology. Education Program', 'Hematology-American Society of Hematology Education Program' ). 946 947pub_graph_table_output( [], _Form, _Iif, _Search, _Rows ) :- !. 948pub_graph_table_output( Stem, Form, Iif, Search, Rows ) :- 949 file_name_extension( Stem, Form, OutF ), 950 pub_graph_table_file_output( Form, OutF, Iif, Search, Rows ), 951 !. 952pub_graph_table_output( _Stem, Form, _Iif, _Search, _Rows ) :- 953 write( user_error, 'Dont know how to write onto: ~w', [Form] ), nl( user_error ). % fixme 954 955pub_graph_table_file_output( html, OutF, _Iif, Search, Rows ) :- 956 tell( OutF ), 957 maplist( pub_graph_table_html_row, Rows, HtmlRows ), 958 term_string( Search, SearchString ), 959 atom_string( SearchAtom, SearchString ), 960 atom_concat( 'search term: ', SearchAtom, HtmlH1 ), 961 reply_html_page( title(HtmlH1), [h1(HtmlH1),table(HtmlRows)] ). 962pub_graph_table_file_output( csv, OutF, Iif, _Search, Rows ) :- 963 pub_graph_iif_header( Iif, Hdr ), 964 csv_write_file( OutF, [Hdr|Rows] ). 965 966pub_graph_table_html_row( row(C,I,P,J,T,A,B), Html ) :- 967 atomic_list_concat( ['https://www.ncbi.nlm.nih.gov/pubmed/?term=',B], Href ), 968 Anc = a(href(Href),T), 969 atomic_list_concat( PParts, ' ', P ), 970 atomic_list_concat( PParts, '_', U ), 971 Html = tr( [ 972 td(C), 973 td(I), 974 td(U), 975 td(J), 976 td(Anc), 977 td(A), 978 td(B) 979 ] ). 980 981pub_graph_iif_header( true, Hdr ) :- 982 Hdr = row('Cited','IF','Published','Journal','Title','Author','Pubmed'). 983pub_graph_iif_header( false, Hdr ) :- 984 Hdr = row('Cited','Published','Journal','Title','Author','Pubmed').
id(s)
IdS.Options is a single term, or list of the following terms:
true
be verbose.false
if you dont want the cache to be updated with newly downloaded information.?- date(Date), Opts = names(['Author','PmcRefCount','Title']), pub_graph_summary_info( 12075665, Results, Opts ), write( date:Date ), nl, member( R, Results ), write( R ), nl, fail. date:date(2018,9,22) Author-[Kemp GJ,Angelopoulos N,Gray PM] PmcRefCount-3 Title-Architecture of a mediator for a bioinformatics database federation. false. ?- pub_graph_summary_info(12075665,Res,[]), member(R,Res), write( R ), nl, fail. Author-[Kemp GJ,Angelopoulos N,Gray PM] Title-Architecture of a mediator for a bioinformatics database federation. Source-IEEE Trans Inf Technol Biomed Pages-116-22 PubDate-2002 Jun Volume-6 Issue-2 ISSN-1089-7771 PmcRefCount-3 PubType-Journal Article FullJournalName-IEEE transactions on information technology in biomedicine : a publication of the IEEE Engineering in Medicine and Biology Society ?- pub_graph_summary_info( cbd251a03b1a29a94f7348f4f5c2f830ab80a909, Res, true ), member( R, Res ), write( R ), nl, fail. arxivId-[] authors-[Graham J. L. Kemp,Nicos Angelopoulos,Peter M. D. Gray] doi-10.1109/TITB.2002.1006298 title-Architecture of a mediator for a bioinformatics database federation topics-[] venue-IEEE Transactions on Information Technology in Biomedicine year-2002 false.
*/
1064pub_graph_summary_info( IdS, Results, OptS ) :- 1065 non_var_list( OptS, Opts ), 1066 non_var_list( IdS, Ids ), 1067 !, 1068 maplist( pub_graph_summary_info_single(Opts), Ids, UnsResults ), 1069 ( memberchk(sort_by(SortField,Order,Nums),Opts) -> 1070 pub_graph_summary_sort( UnsResults, SortField, Order, Nums, OrdResults ) 1071 ; 1072 UnsResults=OrdResults 1073 ), 1074 de_kv_list_on( OrdResults, IdS, Results ). 1075pub_graph_summary_info( Id, Results, OptS ) :- 1076 non_var_list( OptS, Opts ), 1077 pub_graph_summary_info_single( Opts, Id, Id-Results ). 1078 1079pub_graph_summary_info_single( Opts, Id, Id-Results ) :- 1080 memberchk( cache(Type,Handle,_Update), Opts ), 1081 pub_graph_summary_info_cached( Type, Handle, Id, Results, Opts ), 1082 debug( pub_graph, 'Summary from cache for: ~w', Id ), 1083 !. 1084pub_graph_summary_info_single( Opts, Id, Id-Info ) :- 1085 % fixme use _defaults 1086 debug( pub_graph, 'Summary not in cache for: ~w', Id ), 1087 pub_graph_id( Id, IdType ), 1088 ( memberchk(names(Names),Opts) -> true; default_names(IdType,Names) ), 1089 ( memberchk(tmp_file(Tmp),Opts) -> true; true ), 1090 ( memberchk(cache(Type,Handle,true),Opts) -> 1091 summary_info( Tmp, Id, all, ResAll, Opts ) , 1092 findall( Name-Val, (member(Name,Names),memberchk(Name-Val,ResAll)), Info ), 1093 pub_graph_summary_info_cached_update( Type, Handle, Id, ResAll ) 1094 ; 1095 summary_info( Tmp, Id, Names, Info, Opts) 1096 ). 1097 1098pub_graph_summary_info_cached( csv, Handle, Id, Results, Opts ) :- 1099 pub_graph_summary_info_cached( in_mem, Handle, Id, Results, Opts ). 1100pub_graph_summary_info_cached( prolog, Handle, Id, Results, Opts ) :- 1101 pub_graph_summary_info_cached( in_mem, Handle, Id, Results, Opts ). 1102pub_graph_summary_info_cached( sqlite, Handle, Id, Results, Opts ) :- 1103 pub_graph_summary_info_cached( db_facts, Handle, Id, Results, Opts ). 1104pub_graph_summary_info_cached( odbc, Handle, Id, Results, Opts ) :- 1105 pub_graph_summary_info_cached( db_facts, Handle, Id, Results, Opts ). 1106pub_graph_summary_info_cached( db_facts, Handle, Id, Results, Opts ) :- 1107 ( memberchk(names(Names),Opts) -> true; default_names(Names) ), 1108 findall(K-(O-V),(member(K,Names),db_holds(Handle,info(Id,K,O,V))), OPRes), 1109 OPRes \== [], 1110 keysort( OPRes, OPRord ), 1111 % kv_decompose_vs( OPRord, PRord ), 1112 nest_pair_flatten_removes( OPRord, PRord ), 1113 PRord = [K1-V1|PRTail], 1114 kvs_to_unique_k_v_as_list( PRTail, K1, [V1], Results ). 1115pub_graph_summary_info_cached( in_mem, _Handle, Id, Results, Opts ) :- 1116 ( memberchk(names(Names),Opts) -> true; default_names(Names) ), 1117 findall( K-V, (member(K,Names),pub_graph_cache_info:info(Id,K,V)), PRes ), 1118 PRes \== [], 1119 keysort( PRes, PRord ), 1120 PRord = [K1-V1|PRTail], 1121 kvs_to_unique_k_v_as_list( PRTail, K1, [V1], Results ). 1122 1123pub_graph_summary_info_cached_update( csv, Alias, Id, KVs ) :- 1124 pub_graph_summary_info_cached_update( in_mem, Alias, Id, KVs ). 1125pub_graph_summary_info_cached_update( prolog, Alias, Id, KVs ) :- 1126 pub_graph_summary_info_cached_update( in_mem, Alias, Id, KVs ). 1127pub_graph_summary_info_cached_update( in_mem, Alias, Id, KVs ) :- 1128 retractall( info(Id,_K,_O,_V) ), 1129 retractall( info_date(Id,_) ), 1130 date( Date ), 1131 assert( info_date(Id,Date) ), 1132 findall( _, ( member(K-V,KVs), (is_list(V)->nth1(N,V,Ve);Ve=V,N=1),assert(Alias, info(Id,K,N,Ve),_) ), _ ). 1133pub_graph_summary_info_cached_update( sqlite, Alias, Id, KVs ) :- 1134 pub_graph_summary_info_cached_update( db_facts, Alias, Id, KVs ). 1135pub_graph_summary_info_cached_update( odbc, Alias, Id, KVs ) :- 1136 pub_graph_summary_info_cached_update( db_facts, Alias, Id, KVs ). 1137pub_graph_summary_info_cached_update( db_facts, Alias, Id, KVs ) :- 1138 db_retractall( Alias, info(Id,_K,_O,_V), _Aff1 ), 1139 db_retractall( Alias, info_date(Id,_), _Aff2 ), 1140 date( Date ), 1141 db_date_sql_atom( Date, SqlDate ), 1142 db_assert( Alias, info_date(Id,SqlDate), _ ), 1143 findall( _, ( member(K-V,KVs), (is_list(V)->nth1(N,V,Ve);Ve=V,N=1),db_assert(Alias, info(Id,K,N,Ve),_) ), _ ).
?- pub_graph_abstracts( 24939894, Abs ). Abs = ['Lemur tyrosine kinase 3 (LMTK3) is associated with cell proliferation and',...].
1159pub_graph_abstracts( IdS, Abstracts ) :- 1160 non_var_list( IdS, Ids ), 1161 pub_graph_get_abstracts( Ids, Ibs ), 1162 de_kv_list_on( Ibs, IdS, Abstracts ). 1163 1164pub_graph_get_abstracts( Ids, Ibs ) :- 1165 atomic_list_concat( Ids, ',', IdsAtom ), 1166 atomic_list_concat( [id,IdsAtom], '=', IdsReq ), 1167 url_eutils( Eutils ), 1168 url_efetch_pubmed( Efetch ), 1169 atom_concat( Eutils, Efetch, EEReq ), 1170 RmdReq = 'retmode=text', 1171 RtpReq = 'rettype=abstract', 1172 atomic_list_concat( [EEReq,IdsReq,RmdReq,RtpReq], '&', Url ), 1173 get_url_in_file( Url, true, Tmp ), 1174 % write( file_is(Tmp) ), nl, 1175 file_abstracts( Tmp, Ibs ). 1176 1177file_abstracts( File, Ibs ) :- 1178 read_file_to_codes( File, Codes, [] ), 1179 Codes = [0'\n|Tcodes], 1180 codes_abstracts( Tcodes, Ibs ). 1181 1182% Each abstract-entry has sections seperated by \n. 1183codes_abstracts( Codes, Ibs ) :- 1184 % write( codes(Codes) ), nl, 1185 codes_abstracts_sections( Sections, [], Codes, [] ), 1186 % length( Sections, Len ), 1187 % maplist( write_codes_ln, Sections ), 1188 % write( sections_length(Len) ), nl, 1189 code_sections_abstracts( Sections, Ibs ). 1190 1191write_codes_ln( Codes ) :- 1192 atom_codes( Atom, Codes ), 1193 write( Atom ), nl. 1194 1195codes_abstracts_sections( Sects, Acc ) --> 1196 "\n\n", consume_new_line, 1197 !, 1198 { ( Acc==[] -> Sects=TSects; 1199 reverse(Acc,Sect),Sects=[Sect|TSects] 1200 %, atom_codes( Att, Sect), write( att_acc(Att,Acc) ), nl 1201 ) }, 1202 codes_abstracts_sections( TSects, [] ). 1203codes_abstracts_sections( Sects, Acc ) --> 1204 [C], 1205 !, 1206 codes_abstracts_sections( Sects, [C|Acc] ). 1207codes_abstracts_sections( Sects, Acc ) --> 1208 { ( Acc == [] -> Sects = [] 1209 ; reverse( Acc, Sect ), 1210 % atom_codes( SectAtom, Sect ), 1211 % write( section(SectAtom,Acc) ), nl, 1212 Sects = [Sect] ) }. 1213 1214consume_new_line --> "\n", {!}. 1215consume_new_line --> []. 1216 1217code_sections_abstracts( [], [] ). 1218code_sections_abstracts( [_A,_B,_C,_D,_Comm,E,_Copy,F|T], [Id-Elns|Ibs] ) :- 1219 atom_codes( Fat, F ), 1220 debug( pub_graph, 'Parsing abstract from line:~w', Fat ), 1221 code_section_pub_med_id( Id, F, _ ), 1222 debug( pub_graph, 'Parsing abstract for id:~w', Id ), 1223 code_section_lines( E, Elns ), 1224 !, 1225 code_sections_abstracts( T, Ibs ). 1226code_sections_abstracts( [_Ref,_Ttl,_Ath,_Ainf,Abs,Copy,Pmi|T], [Id-AbsLns|Ibs] ) :- 1227 copyright_line( Copy, _ ), 1228 atom_codes( Pat, Pmi ), 1229 debug( pub_graph, 'Parsing abstract from line:~w', Pat ), 1230 code_section_pub_med_id( Id, Pmi, _ ), 1231 debug( pub_graph, 'Parsing abstract for id:~w', Id ), 1232 code_section_lines( Abs, AbsLns ), 1233 !, 1234 code_sections_abstracts( T, Ibs ). 1235code_sections_abstracts( [_Ref,_Ttl,_Ath,_Ainf,_Cmnt,Abs,Pmi|T], [Id-AbsLns|Ibs] ) :- 1236 atom_codes( Pat, Pmi ), 1237 debug( pub_graph, 'Parsing abstract from line:~w', Pat ), 1238 code_section_pub_med_id( Id, Pmi, _ ), 1239 !, 1240 debug( pub_graph, 'Parsing abstract for id:~w', Id ), 1241 code_section_lines( Abs, AbsLns ), 1242 code_sections_abstracts( T, Ibs ). 1243code_sections_abstracts( [_Ref,_Ttl,_Ath,_Anf,Abs,Pmi|T], [Id-AbsLns|Ibs] ) :- 1244 atom_codes( Pat, Pmi ), 1245 debug( pub_graph, 'Parsing abstract from line:~w', Pat ), 1246 code_section_pub_med_id( Id, Pmi, _ ), 1247 !, 1248 debug( pub_graph, 'Parsing abstract for id:~w', Id ), 1249 code_section_lines( Abs, AbsLns ), 1250 code_sections_abstracts( T, Ibs ). 1251code_sections_abstracts( [_Ref,_Ttl,_Ath,Pmi|T], [Id-[]|Ibs] ) :- 1252 atom_codes( Pat, Pmi ), 1253 debug( pub_graph, 'Parsing abstract from line:~w', Pat ), 1254 code_section_pub_med_id( Id, Pmi, _ ), 1255 debug( pub_graph, 'Parsing abstract for id:~w', Id ), 1256 !, 1257 code_sections_abstracts( T, Ibs ). 1258code_sections_abstracts( [_A,_B,_C,_D,F|T], [Id-Elns|Ibs] ) :- 1259 atom_codes( Fat, F ), 1260 debug( pub_graph, 'Parsing non-abstract from line:~w', Fat ), 1261 code_section_pub_med_id( Id, F, _ ), 1262 Elns = [], 1263 code_sections_abstracts( T, Ibs ). 1264 1265/* 1266code_sections_abstracts( [_Ref,_Ttl,_Ath,Abs,Copy,Pmi|T], [Id-AbsLns|Ibs] ) :- 1267 copyright_line( Copy, _ ), 1268 atom_codes( Pat, Pmi ), 1269 debug( pub_graph, 'Parsing abstract from line:~w', Pat ), 1270 code_section_pub_med_id( Id, Pmi, _ ), 1271 !, 1272 debug( pub_graph, 'Parsing abstract for id:~w', Id ), 1273 code_section_lines( Abs, AbsLns ), 1274 code_sections_abstracts( T, Ibs ). 1275code_sections_abstracts( [_Ref,_Ttl,_Ath,_Anf,Abs,Pmi|T], [Id-AbsLns|Ibs] ) :- 1276 atom_codes( Pat, Pmi ), 1277 debug( pub_graph, 'Parsing abstract from line:~w', Pat ), 1278 code_section_pub_med_id( Id, Pmi, _ ), 1279 !, 1280 debug( pub_graph, 'Parsing abstract for id:~w', Id ), 1281 code_section_lines( Abs, AbsLns ), 1282 code_sections_abstracts( T, Ibs ). 1283code_sections_abstracts( [_Ref,_Ttl,_Ath,Pmi|T], [Id-[]|Ibs] ) :- 1284 atom_codes( Pat, Pmi ), 1285 debug( pub_graph, 'Parsing abstract from line:~w', Pat ), 1286 code_section_pub_med_id( Id, Pmi, _ ), 1287 !, 1288 debug( pub_graph, 'Parsing abstract for id:~w', Id ), 1289 code_sections_abstracts( T, Ibs ). 1290code_sections_abstracts( [_A,_B,_C,_D,F|T], [Id-Elns|Ibs] ) :- 1291 atom_codes( Fat, F ), 1292 debug( pub_graph, 'Parsing non-abstract from line:~w', Fat ), 1293 code_section_pub_med_id( Id, F, _ ), 1294 Elns = [], 1295 code_sections_abstracts( T, Ibs ). 1296 */ 1297 1298code_section_lines( E, Elns ) :- 1299 code_section_code_lines( EcLns, [], E, [] ), 1300 maplist( atom_codes, Elns, EcLns ). 1301 1302code_section_code_lines( Lns, Acc ) --> 1303 "\n", 1304 !, 1305 { Acc == [] -> Lns = TLns; reverse( Acc, Ln ), Lns = [Ln|TLns] }, 1306 code_section_code_lines( TLns, [] ). 1307code_section_code_lines( Lns, Acc ) --> 1308 [C], 1309 !, 1310 code_section_code_lines( Lns, [C|Acc] ). 1311code_section_code_lines( Lns, Acc ) --> 1312 { Acc == [] -> Lns = []; reverse(Acc,Ln), Lns = [Ln] }. 1313 1314copyright_line --> 1315 "Copyright", !. 1316copyright_line --> 1317 "©", !. 1318 1319code_section_pub_med_id( Id ) --> 1320 "PMID: ", 1321 !, 1322 code_section_in_pub_med_id( IdCs ), 1323 {number_codes(Id,IdCs)}. 1324code_section_pub_med_id( Id ) --> 1325 [_C], 1326 code_section_pub_med_id( Id ). 1327 1328code_section_in_pub_med_id( [C|Cs] ) --> 1329 [C], 1330 { 0'0 =< C, C =< 0'9}, 1331 !, 1332 code_section_in_pub_med_id( Cs ). 1333code_section_in_pub_med_id( [] ) --> []. 1334 1335pub_graph_summary_sort( Summ, By, Ord, Nums, Sorted ) :- 1336 summary_items_have( Summ, By, Nums, Have, Havenots ), 1337 sort( Have, Sorto ), 1338 ( Ord == asc -> Sorto = Ordo; reverse(Sorto,Ordo) ), 1339 kv_decompose_vs( Ordo, Disc ), 1340 % disconnect_double_minus( Ordo, Disc ), 1341 append( Disc, Havenots, Sorted ). 1342 1343disconnect_double_minus( [], [] ). 1344disconnect_double_minus( [_K1-K-V|T], [K-V|M] ) :- 1345 disconnect_double_minus( T, M ). 1346 1347summary_items_have( [], _By, _Nums, [], [] ). 1348summary_items_have( [K-List|T], By, Nums, Haves, Havenots ) :- 1349 ( memberchk(By-ByItem,List) -> 1350 ( Nums == true -> 1351 to_number( ByItem, ByItemNum ), 1352 Haves = [ByItemNum-(K-List)|THaves], 1353 Havenots = THavenots 1354 ; 1355 Haves = [ByItem-(K-List)|THaves], 1356 Havenots = THavenots 1357 ) 1358 ; 1359 Haves = THaves, 1360 Havenots = [K-List|THavenots] 1361 ), 1362 summary_items_have( T, By, Nums, THaves, THavenots ).
Options is a single term, or list of the following terms:
Type == false
or absent to turn caching offtrue
Type is one of csv,prolog,sqlite
and odbc
. In the first 3 cases, Object should be a filename
and for odbc
it should be a DSN token. In the case of filenames, the default value for Object
is formed as, <type>_<id1>{_<id2>}.<type_ext>.
<type_ext> is either set to Ext or if this is missing it is deduced from Type. It can be set to ''
if you want no extension added.
Graph is compatible with the graph representation of Prolog unweighted graphs. That is, all vertices should appear in a keysorted list as V-Ns pairs, where V is the vertex and Ns is the sorted list of all its neighbours. Ns is the empty list if V has no neighbours, although this should only be the case here, if one of the input Ids has no citing papers or for the nodes at the edge of Depth.
?- pub_graph_cited_by_graph( 12075665, G, cache(sqlite) ).
*/
1407pub_graph_cited_by_graph( IdS, Graph, Args ) :- 1408 % non_var_list( OptS, Opts ), 1409 % pub_graph_graph_defaults( Defs ), 1410 options_append( pub_graph_graph, Args, Opts ), 1411 options( depth(Depth), Opts ), 1412 ( Depth =:= 0 -> Ds = inf, De = 0; integer(Depth), Ds = bound, De = Depth ), 1413 non_var_list( IdS, Ids ), 1414 maplist( kv_elem_set(0), Ids, IDs ), 1415 options( verbose(Pgs), Opts ), 1416 opts_list_opens_cby_cache( Opts, Ids, COpts ), % maybe listed(Opts) instead of All ? 1417 pub_graph_cited_by_graph_depth( IDs, [], Ds, De, Pgs, [], COpts, Orph, Graph1 ), 1418 opts_list_closes_cby_cache( COpts ), 1419 sort( Orph, Overts ), 1420 keysort( Graph1, GraphOrd ), 1421 add_vertices( GraphOrd, Overts, Graph ). 1422 1423opts_list_opens_cby_cache( Opts, Ids, Copts ) :- 1424 ( (memberchk(cache(Type),Opts),\+Type==false) -> 1425 ( memberchk(ext(Ext),Opts) -> true ; file_type_extension(Type,Ext) ), 1426 ( memberchk(location(Loc),Opts) -> 1427 ext_file( Loc, Ext, Obj ) 1428 ; 1429 ( Ids = [A,B|_] -> IList = [A,B]; Ids = [A|_], IList = [A] ), 1430 atomic_list_concat( [Type|IList], '_', FStem ), 1431 file_name_extension( FStem, Ext, Obj ) 1432 ), 1433 memberchk( date(Date), Opts ), 1434 memberchk( update(Upd), Opts ), 1435 pub_graph_cache_open( Type, Obj, cited_by, Handle, Opts ), 1436 Copts = [cache(Type,Handle,Date,Upd)|Opts] 1437 ; 1438 Copts = Opts 1439 ). 1440 1441opts_list_closes_cby_cache( COpts ) :- 1442 memberchk( cache(Type,Handle,_,_) ,COpts ), 1443 !, 1444 pub_graph_cache_save( Type, Handle, cited_by/3, [] ). 1445opts_list_closes_cby_cache( _ ). 1446 1447pub_graph_cited_by_treadmill_defaults( Defs ) :- 1448 Defs = [ depth(5), 1449 ext(pl), 1450 file(graph_treadmilling), 1451 single_file(false) 1452 ].
Options is a single term or list of the following:
file(File)
file to use for storagesingle_file(Single)
boolean value, def. is true
.
false
seperate (aggregating) files are created
at each iteration
depth(D)
the overall depth limit1471pub_graph_cited_by_treadmill( Ids, Graph, Args ) :- 1472 % pub_graph_treadmill_defaults( Defs ), 1473 options_append( pub_graph_cited_by_treadmill, Args, Opts ), 1474 options( file(File), Opts ), 1475 options( single_file(Single), Opts ), 1476 options( depth(D), Opts ), 1477 options( ext(Ext), Opts ), 1478 ext_file( File, Ext, PlFile ), 1479 ( exists_file(PlFile) -> 1480 consult( pub_graph_cache:PlFile ) 1481 ; 1482 true 1483 ), 1484 Oterm = topts(D,File,Single), 1485 pub_graph_cited_by_treadmill_deepening( 1, Oterm, Ids, _, Graph ). 1486 1487 1488pub_graph_cited_by_treadmill_deepening( Di, topts(Dlim,_,_), _Ids, Graph, Graph ) :- 1489 Di > Dlim, 1490 !, 1491 % write( overall_depth_limit_reached(Dlim) ), nl. 1492 true. 1493pub_graph_cited_by_treadmill_deepening( Di, Topts, Ids, _Current, Graph ) :- 1494 Topts = topts(Dlim,File,Single), 1495 Opts = [verbose(true),cache(prolog,_,date(2012,09,01),true),depth(Di),verbose(true)], 1496 pub_graph_cited_by_graph( Ids, GraphI, Opts ), 1497 % length( GraphI, LenGI ), 1498 % write( population_at_depth(Di,LenGI) ), nl, 1499 pub_graph_treadmill_file_name( Single, File, Di, Dlim, Fname ), 1500 % write( onto_file(Fname) ), nl, 1501 pub_graph_cache_save( prolog, Fname, cited_by/3, [] ), 1502 Dstar is Di + 1, 1503 pub_graph_cited_by_treadmill_deepening( Dstar, Topts, Ids, GraphI, Graph ).
csv,prolog,odbc
and sqlite
files are recognised.
The former two are consulted into module pub_graph_cache
, and Handle is therofore not used.
For odbc/sqlite
files the lookups and database access is via the odbc and prosqlite libraries respectively.
Handle can be named to an alias of choise, otherwise a opaque atom is returned with which the db is accessed.
Which, should either be cited_by
or info
.
Options is a term or list of terms from:
ext(Ext)
extension to try on the file. Use the empty atom if you do not want the library to
use the default extension for the type of cache used.
Options are also passed to the underlying open operations for the type chosen. So for instance
you can provide the username and passward for the odbc connection with user(U)
and password(P)
.
*/
1524pub_graph_cache_open( csv, FileIn, Which, Handle, Opts ) :- 1525 memberchk( Which, [cited_by,info] ), 1526 % atom_concat( pub_graph_cache_, Which, Module ), 1527 pub_graph_cache_open_csv_file( FileIn, pub_graph_cache, Which, Handle, Opts ). 1528pub_graph_cache_open( prolog, FileIn, Which, Handle, Opts ) :- 1529 memberchk( Which, [cited_by,info] ), 1530 % atom_concat( pub_graph_cache_, Which, Module ), 1531 pub_graph_cache_open_pl_file( FileIn, pub_graph_cache, Handle, Opts ). 1532pub_graph_cache_open( sqlite, FileIn, Which, Alias, Opts ) :- 1533 memberchk( Which, [cited_by,info] ), 1534 pub_graph_cache_open_sqlite( FileIn, Alias, Which, Opts ). 1535pub_graph_cache_open( odbc, Dsn, _Which, Alias, Opts ) :- 1536 ( db_current_connection(Alias,Type) -> 1537 Fail = failed_to_open_odbc_on_existing_alias(Alias,Type,Dsn), 1538 write( user_error, Fail ), nl( user_error ), fail 1539 ; 1540 Defs = [alias(Alias),user(pub_graph),password(pmed123)], 1541 append( Opts, Defs, All ), 1542 odbc_connect( Dsn, Alias, All ) 1543 ). 1544 1545pub_graph_cache_open_csv_file( FileIn, Module, Pname, CbyF, Opts ) :- 1546 append( Opts, [ext(csv)], All ), 1547 options( ext(Ext), All ), 1548 ext_file( FileIn, Ext, CbyF ), 1549 (exists_file(CbyF) -> 1550 csv_read_file( CbyF, Rows, [functor(Pname)] ), 1551 ( memberchk(flat_input(IFlat),Opts) -> true; memberchk(flat(IFlat),Opts) ), 1552 ( IFlat==true -> 1553 sort(Rows,Ord), 1554 ord_facts_aggregate_arg(Ord,3,Agr) 1555 ; 1556 maplist( csv_row_to_nest_row, Rows, Agr ) 1557 ), 1558 maplist( assert_at(Module), Agr ) 1559 ; 1560 true 1561 ). 1562pub_graph_cache_open_pl_file( FileIn, Module, CbyF, Opts ) :- 1563 append( Opts, [ext(pl)], All ), 1564 memberchk( ext(Ext), All ), 1565 ext_file( FileIn, Ext, CbyF ), 1566 (exists_file(CbyF) -> call( Module:consult(CbyF) ) ; true ). 1567 1568pub_graph_cache_open_sqlite( FileIn, Alias, Type, Opts ) :- 1569 append( Opts, [ext(sqlite)], All ), 1570 memberchk( ext(Ext), All ), 1571 ext_file( FileIn, Ext, SqliteF ), 1572 ( exists_file(SqliteF) -> 1573 true 1574 ; 1575 atomic_list_concat( ['Creating pub_graph',Type,'SQLite database at'], ' ', Create ), 1576 write( Create:SqliteF ), nl, 1577 sqlite_connect( SqliteF, HandleC, exists(false) ), 1578 findall( _, 1579 ( pub_graph_create_sqlite_type_statement( Type, Statement ), 1580 prosqlite:sqlite_query( HandleC, Statement, _Row ) 1581 ), 1582 _ ), 1583 1584 sqlite_disconnect( HandleC ) % fixme: make sure prosqlite closes with handles as well as with aliases. 1585 ), 1586 % atom_concat( pub_graph_cache_, Type, Mod ), 1587 Mod = pub_graph_cache, 1588 ( var(Alias) -> Alias = HandleD ; Ao = [alias(Alias)] ), 1589 OptsD = [as_predicates(true),at_module(Mod)|Ao], 1590 sqlite_connect( SqliteF, HandleD, OptsD ). 1591 1592pub_graph_create_sqlite_type_statement( cited_by, 'CREATE TABLE cited_by (pubmed_id bigint(20), ret_date date, citer bigint(20), Primary Key (pubmed_id,citer) );' ). 1593pub_graph_create_sqlite_type_statement( info, 'CREATE TABLE info_date (pubmed_id bigint(20), rec_date date, Primary Key (pubmed_id) );' ). 1594pub_graph_create_sqlite_type_statement( info, 'CREATE TABLE info (pubmed_id bigint(20), key_name tinytext, nth integer, key_value smalltext, Primary Key (pubmed_id,key_name,nth) );' ).
Opts a term or list of terms from:
1607pub_graph_cache_save( prolog, FileIn, WhatIf, Opts ) :- 1608 pub_graph_cache_save( in_mem, prolog, FileIn, WhatIf, Opts ). 1609pub_graph_cache_save( csv, FileIn, WhatIf, Opts ) :- 1610 pub_graph_cache_save_in_mem( csv, FileIn, WhatIf, Opts ). 1611pub_graph_cache_save( sqlite, Handle, _, _Opts ) :- 1612 sqlite_disconnect( Handle ). 1613pub_graph_cache_save( odbc, Handle, _, _Opts ) :- 1614 odbc_disconnect( Handle ). 1615pub_graph_cache_save( db_facts, Handle, _, Opts ) :- 1616 db_current_connection( Handle, Type ), 1617 Type \== db_facts, 1618 pub_graph_cache_save( Type, Handle, _, Opts ). 1619 1620pub_graph_cache_save_in_mem( Which, File, WhatIf, Opts ) :- 1621 pg_en_list( WhatIf, What ), 1622 /* ( file_name_extension(_,'',FileIn) -> 1623 file_name_extension( FileIn, pl, File ) ; File = FileIn), 1624 */ 1625 ( (memberchk(flat(true),Opts),Which=_Something/3) -> Flat = true; Flat = false ), 1626 open( File, write, Out ), 1627 pub_graph_cache_dump( Which, What, Out, Flat ), 1628 close( Out ), 1629 forall( member(W,What), (W=Pname/Arity,functor(G,Pname,Arity),retractall(pub_graph_cache:)) ). 1630 1631pub_graph_cache_dump( Which, What, Out, Flat ) :- 1632 member( Name/Arity, What ), 1633 functor( Goal, Name, Arity ), 1634 once(pl_cache_predicate_module(Name,Arity,Module) ), 1635 forall( call(Module:Goal), record_in_mem_clause(Which,Flat,Out,Goal) ). 1636 1637record_in_mem_clause( Which, Flat, Out, Fact ) :- 1638 ( Flat == true -> 1639 arg( 3, Fact, Third ), 1640 ( Third == [] -> 1641 arg( 1, Fact, First ), arg( 2, Fact, Second ), 1642 functor( Fact, Name, 3 ), 1643 functor( New, Name, 3 ), 1644 arg( 1, New, First ), arg( 2, New, Second ), arg( 3, New, 0 ) 1645 ; 1646 ( is_list(Third) -> 1647 arg( 1, Fact, First ), arg( 2, Fact, Second ), 1648 functor( Fact, Name, 3 ), 1649 functor( New, Name, 3 ), 1650 arg( 1, New, First ), arg( 2, New, Second ), 1651 forall( member(Elem,Third), (arg(3,New,Elem),record_fact(Which,Out,New)) ) 1652 ; 1653 record_fact( Which, Out, Fact ) 1654 ) 1655 ) 1656 ; 1657 record_fact( Which, Out, Fact ) 1658 ). 1659 /* wouldn't be cool to have portray clause/2 write on csvs automatically ? 1660 then we need open_csv to register the csv files 1661 */ 1662record_fact( prolog, Out, Fact ) :- 1663 portray_clause( Out, Fact ). 1664record_fact( csv, Out, Fact ) :- 1665 % not-checked ! 1666 % see if we can get away with the [] 1667 mapargs( pl_term_csv_atom, Fact, Csv ), 1668 csv_write_stream( Out, [Csv], [] ). 1669 1670pl_cache_predicate_module( cited_by, 3, pub_graph_cache ). 1671pl_cache_predicate_module( info, 3, pub_graph_cache ). 1672pl_cache_predicate_module( info_date, 2, pub_graph_cache ). 1673 1674% Section: non-interface predicates... 1675 1676pub_graph_cited_by_graph_depth( [], _Seen, _Ds, _De, _Pgs, Graph, _Opts, [], Graph ). 1677pub_graph_cited_by_graph_depth( [Id-D|T], Seen, Ds, De, Pgs, Acc, Opts, Orph, Graph ) :- 1678 ( Pgs==true -> length(T,Len), 1679 writeq(id(Id,Len)), write('.'), nl 1680 ; true ), 1681 ( ord_memberchk(Id,Seen) -> 1682 Pairs = T, 1683 NxSeen = Seen, 1684 Nxt = Acc, 1685 Orph = TOrph 1686 ; 1687 pub_graph_cited_by( Id, IdAnc, Opts ), 1688 ( Pgs==true -> 1689 write( has(Id,IdAnc) ), write( '.' ), nl 1690 ; 1691 true 1692 ), 1693 D1 is D + 1, 1694 ( depth_limit_reached(Ds,De,D1) -> 1695 Pairs = T, 1696 NxSeen = Seen, 1697 Nxt = Acc, 1698 Orph = [Id|TOrph] 1699 ; 1700 ord_add_element( Seen, Id, NxSeen ), 1701 maplist( kv_elem_set(D1), IdAnc, New ), 1702 append( T, New, Pairs ), 1703 sort( IdAnc, AncSort ), 1704 Nxt = [Id-AncSort|Acc], 1705 Orph = TOrph 1706 ) 1707 ), 1708 pub_graph_cited_by_graph_depth( Pairs, NxSeen, Ds, De, Pgs, Nxt, Opts, TOrph, Graph ). 1709 1710depth_limit_reached( bound, Lim, Val ) :- 1711 Lim < Val. 1712 1713kv_elem_set( Val, Key, Key-Val ). 1714 1715summary_info( Tmp, Id, WhichIn, Results, OptS ) :- 1716 pub_graph_id( Id, IdType ), 1717 non_var_list( OptS, Opts ), 1718 ( WhichIn == all -> default_names(IdType,Which) ; pg_en_list(WhichIn,Which) ), 1719 summary_info( IdType, Tmp, Id, Which, Results, Opts ). 1720 1721summary_info( ncbi, Tmp, Id, Which, Results, Opts ) :- 1722 url_eutils( Eutils ), 1723 ( memberchk(retmax(RMax),Opts) -> true; RMax = 100 ), 1724 atomic_list_concat( ['esummary.fcgi?report=xml&mode=text&tool=wget&retmax=',RMax,'&db=pubmed&id='], Query ), 1725 atomic_list_concat( [Eutils,Query,Id], Url ), 1726 get_url_in_file( Url, false, Tmp ), % fixme 1727 load_xml_file( Tmp, Xml ), 1728 Elem = element(_,_,[Entry]), 1729 findall( Name-Info, ( 1730 member( Name, Which ), 1731 findall( Entry, 1732 search_element_in_list(Xml, 'Item', ['Name'=Name], Elem ), 1733 PInfo ), 1734 ( PInfo = [Info] -> true; PInfo = Info ) 1735 ), Results ), 1736 ( memberchk(tmp_keep(true),Opts) -> true; delete_file(Tmp) ). 1737summary_info( semscholar, _Tmp, Id, Which, Results, _Opts ) :- 1738 semscholar_id_json( Id, Json ), 1739 findall( Name-Info, ( member(Name,Which), 1740 member(Name=InfoPrv,Json), 1741 json_value(InfoPrv,Info) 1742 ), 1743 Results 1744 ). 1745 1746pub_graph_date_cached( prolog, Explicit, Default, Cutoff, Id, Return ) :- 1747 pub_graph_date_cached( in_mem, Explicit, Default, Cutoff, Id, Return ). 1748pub_graph_date_cached( csv, Explicit, Default, Cutoff, Id, Return ) :- 1749 pub_graph_date_cached( in_mem, Explicit, Default, Cutoff, Id, Return ). 1750pub_graph_date_cached( in_mem, _Handle, Pname, Cutoff, Id, Return ) :- 1751 % ( var(Explicit) -> Explicit=Default; true ), 1752 G =.. [Pname,Id,Date,Return], 1753 once(pl_cache_predicate_module(Pname,3,Module) ), 1754 call( Module:G ), 1755 Cutoff @=< Date. 1756pub_graph_date_cached( sqlite, Handle, Pname, Cutoff, Id, Return ) :- 1757 pub_graph_date_cached( db_facts, Handle, Pname, Cutoff, Id, Return ). 1758pub_graph_date_cached( odbc, Handle, Pname, Cutoff, Id, Return ) :- 1759 pub_graph_date_cached( db_facts, Handle, Pname, Cutoff, Id, Return ). 1760pub_graph_date_cached( db_facts, Handle, Pname, Cutoff, Id, Return ) :- 1761 G =.. [Pname,Id,SQLDate,R], 1762 findall( R, db_holds(Handle,G), PrvReturn ), 1763 PrvReturn \== [], 1764 once( db_holds(Handle,G) ), 1765 ( atom(SQLDate) -> 1766 db_date_sql_atom( Date, SQLDate ) 1767 ; 1768 % odbc translates this, so should sqlite 1769 Date = SQLDate 1770 ), 1771 1772 ( PrvReturn = [0] -> Return = [] ; Return = PrvReturn ), 1773 Cutoff @=< Date. 1774 1775/* commenting this out probably makes as_predicates(true) obsolete 1776 G =.. [PredName,Id,SQLDate,R], 1777 once(pl_cache_predicate_module(PredName,3,Module) ), 1778 findall( R, call(Module:G), PrvReturn ), 1779 PrvReturn \== [], 1780 once( Module:G ), 1781 date_sql_atom( Date, SQLDate ), 1782 ( PrvReturn = [0] -> Return = [] ; Return = PrvReturn ), 1783 Cutoff @=< Date. 1784 */ 1785 1786pub_graph_update_cache( csv, Explicit, Default, Id, Return ) :- 1787 pub_graph_update_cache( in_mem, Explicit, Default, Id, Return ). 1788pub_graph_update_cache( prolog, Explicit, Default, Id, Return ) :- 1789 pubned_update_cache( in_mem, Explicit, Default, Id, Return ). 1790pub_graph_update_cache( in_mem, _Handle, Pname, Id, Return ) :- 1791 % ( var(Explicit) -> Explicit=Default; true ), 1792 date(Date), 1793 R =.. [Pname,Id,_,_], 1794 retractall( pub_graph_cache: ), 1795 G =.. [Pname,Id,Date,Return], 1796 assert( pub_graph_cache: ). 1797 1798pub_graph_update_cache( sqlite, Handle, PredName, Id, PrvReturn ) :- 1799 pub_graph_update_cache( db_facts, Handle, PredName, Id, PrvReturn ). 1800pub_graph_update_cache( odbc, Handle, PredName, Id, PrvReturn ) :- 1801 pub_graph_update_cache( db_facts, Handle, PredName, Id, PrvReturn ). 1802pub_graph_update_cache( db_facts, Handle, PredName, Id, PrvReturn ) :- 1803 % ( var(Explicit) -> Explicit=Default; true ), 1804 date(Date), 1805 R =.. [PredName,Id,_,_], 1806 db_retractall( Handle, R, _ ), 1807 db_date_sql_atom( Date, SqlDate ), 1808 G =.. [PredName,Id,SqlDate,Ret], 1809 ( PrvReturn = [] -> Return = [0]; Return = PrvReturn ), 1810 findall( Ret, (member(Ret,Return),db_assert(Handle,G,_)), _ ). 1811 1812pub_graph_cited_by_parse_file(Tmp, Ids) :- 1813 load_xml_file( Tmp, Xml ), 1814 once( search_element_in_list( Xml, 'LinkSetDb', [], element(_,_,LXml) ) ), 1815 findall( CId, 1816 ( search_element_in_list(LXml,'Id',[],element(_,_,[CIdA])), 1817 atom_codes(CIdA,CIdCs), number_codes(CId,CIdCs) 1818 ), 1819 Ids ). 1820 1821% Section: auxiliaries
pub_graph_elink_date_option(DateKey)
holds.
?- pub_graph_search_period_opts( '', Period, [reldate(60)] ). Period = 'reldate=60'. ?- pub_graph_search_period_opts( '', Period, [mindate(2014),mindate(2015)] ). Period = 'mindate=2014&mindate=2015'. ?- pub_graph_search_period_opts( '', Period, [mindate(2014),maxdate(2015)] ). Period = 'mindate=2014&maxdate=2015'.
1842pub_graph_search_period_opts( Pfx, Period, Opts ) :- 1843 select( Opt, Opts, Rpts ), 1844 functor( Opt, Oname, 1 ), 1845 pub_graph_elink_date_option( Oname ), 1846 arg( 1, Opt, Arg ), 1847 !, 1848 atomic_list_concat( [Oname,Arg], '=', Atom ), 1849 atomic_list_concat( [Pfx,Atom], '&', Rfx ), 1850 pub_graph_search_period_opts( Rfx, Period, Rpts ). 1851pub_graph_search_period_opts( Period, Period, _Opts ). 1852 1853pub_graph_elink_date_option( reldate ). 1854pub_graph_elink_date_option( mindate ). 1855pub_graph_elink_date_option( maxdate ).
Gap is an integer, used for fields that take approximate search (0 is strictest for allowing 0 interving words)
Qv is the quote_value(Qv)
option value.
1865search_term_to_query( (A,B), Gap, Qv, Query ) :- 1866 !, 1867 search_term_to_query( A, Gap, Qv, Aq ), 1868 search_term_to_query( B, Gap, Qv, Bq ), 1869 atomic_list_concat( [Aq,'+AND+',Bq], Query ). 1870search_term_to_query( (A;B), Gap, Qv, Query ) :- 1871 !, 1872 search_term_to_query( A, Gap, Qv, Aq ), 1873 search_term_to_query( B, Gap, Qv, Bq ), 1874 atomic_list_concat( [Aq,'OR',Bq], '+', Query ). 1875search_term_to_query( (A=B), Gap, Qv, Query ) :- 1876 !, 1877 maplist( quote_curl_atom, [A,B], [Aq,Bq] ), 1878 search_term_proximity_field( A, Gap, Pxf ), 1879 ( Qv == false -> % true is defaulty 1880 atomic_list_concat( [Bq,'[',Aq,Pxf,']'], Query ) 1881 ; 1882 atomic_list_concat( ['%22',Bq,'%22','[',Aq,Pxf,']'], Query ) 1883 ). 1884search_term_to_query( C, _Gap, _Qv, Query ) :- 1885 pg_en_list( C, Clist ), 1886 maplist( quote_curl_atom, Clist, Qlist ), 1887 atomic_list_concat( Qlist, '+', Query ). 1888 1889search_term_proximity_field( A, Gap, Pxf ) :- 1890 search_term_proximity_field( A ), 1891 !, 1892 atomic_list_concat( [':~',Gap], Pxf ). 1893search_term_proximity_field( _A, _Gap, '' ). 1894 1895search_term_proximity_field('Affiliation'). 1896search_term_proximity_field(affiliation). 1897search_term_proximity_field('Title'). 1898search_term_proximity_field(title). 1899search_term_proximity_field('Title/Abstract'). 1900search_term_proximity_field('title/abstract'). 1901% abbreviations 1902search_term_proximity_field(ti). % Title 1903search_term_proximity_field(tiab). % Title/Abstract 1904search_term_proximity_field(ad). % affiliation ? 1905 1906% very basic. for now we are just translating space to %20 1907% we should use some SWI http internals here 1908% 1909quote_curl_atom( In, Out ) :- 1910 atom_codes( In, InCs ), 1911 maplist( quote_curl_code, InCs, NestOutCs ), 1912 flatten( NestOutCs, OutCs ), 1913 atom_codes( Out, OutCs ). 1914 1915quote_curl_code( 0' , [0'%,0'2,0'0] ) :- !. 1916quote_curl_code( 0'", [0'%,0'2,0'2] ) :- !. 1917% quote_curl_code( 0' , [0'+] ) :- !. 1918quote_curl_code( Code, Code ).
1925memberchk_optional( Elem, List ) :- 1926 memberchk( Elem, List ), 1927 !. 1928memberchk_optional( _Elem, _List ). 1929 1930true_atom_keeps_file( Keep, _File ) :- 1931 Keep == true, 1932 !. 1933true_atom_keeps_file( _Keep, File ) :- 1934 delete_file( File ). 1935 1936/* 1937to_list( Either, List ) :- 1938 ( (var(Either);(Either\=[_H|_T],Either\==[]) ) -> 1939 List = [Either] 1940 ; 1941 List = Either 1942 ). 1943 */ 1944 1945true_writes( true, Report ) :- 1946 !, 1947 write( Report ), nl. 1948true_writes( _Opts, _Report ).
1954get_url_in_file(Url, _Verb, File) :- 1955 ( var(File) -> tmp_file_stream(text, File, Stream), close(Stream) ; true), 1956 debug( pub_grapsh, 'Downloading URL: ~p, onto file: ~p', [Url,File] ), 1957 setup_call_cleanup( 1958 http_open(Url, In, 1959 [ % cert_verify_hook(ssl_verify) % checkme: 1960 ]), 1961 setup_call_cleanup( 1962 open(File, write, Out, [type(binary)]), 1963 copy_stream_data(In, Out), 1964 close(Out)), 1965 close(In) 1966 ). 1967 1968% this was the old version: currently not called from anywhere 18.09.22 1969get_url_in_file(curl, URL, Verb, File) :- 1970 ( var(File) -> tmp_file_stream(text, File, Stream), close(Stream) ; true), 1971 ( Verb==true -> Args = ['-o',File,URL] ; Args = ['-s','-o',File,URL] ), 1972 % true_writes( Verb, process_create(path(curl),Args,[]) ), 1973 debug( pub_graph, 'Getting url via curl with args:~w', [Args] ), 1974 % fixme: use url_file/2 1975 process_create( path(curl), Args, [] ), 1976 exists_file( File ). 1977 1978/* 1979http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=science[journal]+AND+breast+cancer+AND+2008[pdat] 1980 1981 */ 1982 1983all_subs_in_xml_single( Xml, Single, SubSel, Subs ) :- 1984 once( search_element_in_list( Xml, Single, [], element(_,_,Nest) ) ), 1985 findall( Sub, 1986 search_element_in_list(Nest,SubSel,[],element(_,_,Sub)), 1987 Subs ).
hostip.pl
.1994search_element_in_list([Content|MoreContent], Name, ListAttributes, Element) :- 1995 ( search_element(Content, Name, ListAttributes, Element) 1996 ; search_element_in_list(MoreContent, Name, ListAttributes, Element) 1997 ). 1998 1999search_element(HTML, Name, ListAttributes, HTML) :- 2000 compound(HTML), 2001 arg(1, HTML, Name), 2002 arg(2, HTML, HTML_Attributi), 2003 forall(member(Attribute, ListAttributes), 2004 memberchk(Attribute, HTML_Attributi)). 2005search_element(HTML, Name, ListAttributes, Element) :- 2006 compound(HTML), 2007 arg(3, HTML, Contents), 2008 search_element_in_list(Contents, Name, ListAttributes, Element). 2009 2010/* 2011to_atom_ids( In, Ids ) :- 2012 to_list( In, List ), 2013 maplist( to_atom_id, List, Ids ). 2014 2015to_atom_id( Either, Atom ) :- 2016 ( number(Either) -> number_codes( Either, Codes ), atom_codes( Atom, Codes ) 2017 ; Atom = Either ). 2018 */ 2019 2020pub_graph_treadmill_file_name( true, File, _Di, _Dl, File ). 2021pub_graph_treadmill_file_name( false, File, Di, Dl, Full ) :- 2022 number_codes( Dl, DLcs ), 2023 number_codes( Di, DIcs ), 2024 reverse( DLcs, DLcsR ), 2025 reverse( DIcs, DIcsR ), 2026 pad_codes( DLcsR, DIcsR, 0'0, DFcsR ), 2027 reverse( DFcsR, DFcs ), 2028 atom_codes( DF, DFcs ), 2029 atomic_list_concat( [File,DF], '_d', Full ). 2030 2031/* 2032options_append( Opts, Defs, All ) :- 2033 to_list( Opts, OptsList ), 2034 append( OptsList, Defs, All ). 2035 */ 2036 2037ext_file( FileIn, Ext, File ) :- 2038 ( file_name_extension(_,Ext,FileIn) -> 2039 File = FileIn 2040 ; 2041 file_name_extension(FileIn,Ext,File) 2042 ). 2043 2044pad_codes( [], [], _Code, [] ) :- !. 2045pad_codes( [_H|T], [], Code, [Code|M] ) :- !, 2046 pad_codes( T, [], Code, M ). 2047pad_codes( [_H|T], [F|R], Code, [F|M] ) :- !, 2048 pad_codes( T, R, Code, M ). 2049 2050kvs_to_unique_k_v_as_list( [], K, PrvVs, [K-V] ) :- 2051 ( PrvVs = [V] -> true ; reverse(PrvVs,V) ). 2052kvs_to_unique_k_v_as_list( [K1-V1|T], K, Vs, KVs ) :- 2053 ( K == K1 -> 2054 Vs2 = [V1|Vs], 2055 TKVs = KVs 2056 ; 2057 Vs2 = [V1], 2058 ( Vs=[VofK] -> true; reverse(Vs,VofK) ), 2059 KVs = [K-VofK|TKVs] 2060 ), 2061 kvs_to_unique_k_v_as_list( T, K1, Vs2, TKVs ). 2062 2063ord_facts_aggregate_arg( [H|T], N, Agr ) :- 2064 functor( H, Pname, Arity ), 2065 findall( On, (between(1,Arity,On),On =\= N), Ons ), 2066 maplist( term_n_arg(H), Ons, Hons ), 2067 ord_facts_aggregate_arg( T, Ons, N, Pname/Arity, Hons, [], Agr ). 2068 2069ord_facts_aggregate_arg( [], Ons, N, Pn/Ar, Hons, HAgrs, [Fact] ) :- 2070 de_singleton( HAgrs, Agr ), 2071 functor( Fact, Pn, Ar ), 2072 maplist( csv_atom_pl_term, Hons, Cons), 2073 maplist( term_n_arg(Fact), [N|Ons], [Agr|Cons] ). 2074ord_facts_aggregate_arg( [H1|T], Ons, N, Funct, Hons, HAgrs, Facts ) :- 2075 term_ons( Ons, H1, Hons1 ), 2076 arg( N, H1, Nth ), 2077 ( Hons = Hons1 -> 2078 HAgrs2 = [Nth|HAgrs], 2079 Facts = TFacts 2080 ; 2081 HAgrs2 = [Nth], 2082 Funct = Pname/Arity, 2083 functor( Fact, Pname, Arity ), 2084 de_singleton( HAgrs, Agr ), 2085 maplist( csv_atom_pl_term, Hons, Cons ), 2086 maplist( term_n_arg(Fact), [N|Ons], [Agr|Cons] ), 2087 Facts = [Fact|TFacts] 2088 ), 2089 ord_facts_aggregate_arg( T, Ons, N, Funct, Hons1, HAgrs2, TFacts ). 2090 2091term_n_arg( Term, N, Arg ) :- arg( N, Term, Arg ). 2092 2093pl_term_csv_atom( date(Yn,Mn,Dn), Csv ) :- 2094 !, 2095 maplist( atom_number, [Y,M,D], [Yn,Mn,Dn] ), 2096 atomic_list_concat( [Y,M,D], '/', Csv ). 2097pl_term_csv_atom( Term, Csv ) :- 2098 ( atomic(Term) -> 2099 ( Term == [] -> Csv = 0; Csv = Term ) 2100 ; 2101 ( is_list(Term) -> 2102 atomic_list_concat( Term, ';', Csv ) 2103 ; 2104 term_to_atom(Term,Csv) 2105 ) 2106 ). 2107 2108csv_atom_pl_term( 0, [] ) :- !. 2109csv_atom_pl_term( CsvDate, Date ) :- 2110 \+ CsvDate = [_|_], 2111 atomic_list_concat( [Y,D,M], '/', CsvDate ), 2112 !, 2113 Date = date(Y,D,M). 2114csv_atom_pl_term( Other, Other ). 2115 2116csv_row_to_nest_row( Csv, Fact ) :- 2117 functor( Csv, Pname, Functor ), 2118 functor( Fact, Pname, Functor ), 2119 arg( 1, Csv, First ), 2120 arg( 1, Fact, First ), 2121 arg( 2, Csv, Second ), 2122 csv_atom_pl_term( Second, Deutepo ), 2123 arg( 2, Fact, Deutepo ), 2124 ( Functor =:= 3 -> 2125 arg( 3, Csv, Third ), 2126 ( (Third=:=0;Third=='0') -> 2127 Tpith = [] 2128 ; 2129 atomic_list_concat( Atoms, ';', Third ), 2130 maplist( to_number, Atoms, Tpith ) 2131 ), 2132 arg( 3, Fact, Tpith ) 2133 ). 2134 2135assert_at( Module, Fact ) :- 2136 % mapargs( csv_atom_pl_term, Row, Fact ), 2137 assert( Module: ). 2138 2139get_url( Url ) :- 2140 file_base_name(Url, Base), 2141 directory_file_path(_Dir, Base, File), 2142 file_mime_type(File, Mime ), 2143 mime_file_type( Mime, FType ), 2144 get_url( Url, File, FType ). 2145 2146get_url( Url, File, Type ) :- 2147 setup_call_cleanup( 2148 http_open(Url, In, []), 2149 setup_call_cleanup( 2150 open(File, write, Out, [type(Type)]), 2151 copy_stream_data(In, Out), 2152 close(Out)), 2153 close(In) ). 2154 2155mime_file_type(text/_, text) :- !. 2156mime_file_type(_, binary). 2157% mime_file_type(text/_, binary). % was this changed it above ? 2158 2159a_month_ago( date(Y1,M1,D1) ) :- 2160 % date( date(Y,M,D) ), 2161 get_time( T ), 2162 stamp_date_time( T, Date, local ), 2163 Date = date(Y,M,D,_H,_N,_S,_,_,_), 2164 ( M =:= 1 -> Y1 is Y - 1, 2165 M1 is 12, 2166 D is D1 % Dec. has at least as many days as Jan. 2167 ; 2168 Y1 is Y, 2169 M1 is M - 1, 2170 % actually for this application D1 is D should suffice... 2171 month_days( M, Ds ), 2172 D1 is min(Ds,D) 2173 ). 2174 2175month_days( 1, 31 ). 2176month_days( 2, 28 ). % ok ok 2177month_days( 3, 31 ). 2178month_days( 4, 30 ). 2179month_days( 5, 31 ). 2180month_days( 6, 30 ). 2181month_days( 7, 31 ). 2182month_days( 8, 31 ). 2183month_days( 9, 30 ). 2184month_days( 10, 31 ). 2185month_days( 11, 30 ). 2186month_days( 12, 31 ). 2187 2188non_var_list( IdS, Ids ) :- 2189 \+ var(IdS), 2190 pg_en_list( IdS, Ids ). 2191 2192de_kv_list_on( [_K-Single], On, Single ) :- 2193 \+ is_list( On ), 2194 !. 2195de_kv_list_on( List, _On, List ). 2196 2197mapargs( Partial, Goal1, Goal2 ) :- 2198 functor( Goal1, Gname, Garity ), 2199 functor( Goal2, Gname, Garity ), 2200 mapargs_1( Garity, Partial, Goal1, Goal2 ). 2201 2202mapargs_1( 0, _Partial, _Goal1, _Goal2 ) :- !. 2203mapargs_1( I, Partial, Goal1, Goal2 ) :- 2204 % functor( Call, Pname, 2 ), 2205 arg( I, Goal1, Arg1 ), 2206 % arg( 1, Call, Arg1 ), 2207 % arg( 2, Call, Arg2 ), 2208 call( Partial, Arg1, Arg2 ), 2209 arg( I, Goal2, Arg2 ), 2210 K is I - 1, 2211 mapargs_1( K, Partial, Goal1, Goal2 ). 2212 2213to_number( Atom, Num ) :- 2214 atom( Atom ), 2215 !, 2216 atom_number( Atom, Num ). 2217to_number( Num, Num ). 2218 2219nest_pair_flatten_removes( [], [] ). 2220nest_pair_flatten_removes( [K-(_-V)|T], [K-V|M] ) :- 2221 nest_pair_flatten_removes( T, M ). 2222 2223kv_decompose_vs( [], [] ). 2224kv_decompose_vs( [_K-V|T], [V|Tv] ) :- 2225 kv_decompose_vs( T, Tv ). 2226 2227pg_en_list( List, TheList ) :- 2228 is_list(List), 2229 !, 2230 TheList = List. 2231pg_en_list( Elem, [Elem] ). 2232 2233semscholar_id_json( Id, Json ) :- 2234 url_semscholar( SemScholar ), 2235 Incl = '?include_unknown_references=true', 2236 atomic_list_concat( [SemScholar,Id,Incl], '', Url ), 2237 http_open( Url, In, [] ), 2238 % json_read_dict( In, Dict ), 2239 json_read( In, JsonT ), 2240 close( In ), 2241 JsonT = json(Json). 2242 2243json_value( Json, Atom ) :- 2244 atomic( Json ), 2245 !, 2246 Atom = Json. 2247json_value( Json, Names ) :- 2248 findall( Name, ( member(json(Sub),Json), 2249 member(name=Name,Sub) 2250 ), 2251 Names )
Access, cache and visualise citation relations in publications servers.
A simple library for communicating with publication information servers: pub med and semantic scholar.
Currently allows (a) searching on conjunctions and disjunctions, (b) fetching the details of a paper
(c) the publications citing a paper, (d) publications cited by a paper, (e) simple reporting of fetched information and (f) storing fethed information to local databases.
Since version 0.1 the library supports caching of the paper information on Prolog term or csv data files
and odbc connected or sqlite databases. Also as of 0.1 pub_graph is debug/1 aware. To see information regarding
the progress of execution, use
The pack requires the curl executable to be in the path. Only tested on Linux.
It is being developed on SWI-Prolog 6.1.8 and it should also work on Yap Prolog.
To install under SWI simply do
The storing of paper and citation depends on db_facts and for sqlite connectivity on proSQlite (both available as SWI packs and from http://stoics.org.uk/~nicos/sware/)
info(ID,Key,O,Value)
. But Key is really a type of information. so we could split this to a number of tables(info:)key(Id,O,Value)
. Alternatively you could make key an enumerate type, which will save loads of space*/