1/*  Part of SWI-Prolog
    2
    3    Author:        Jan Wielemaker
    4    E-mail:        J.Wielemaker@vu.nl
    5    WWW:           http://www.swi-prolog.org
    6    Copyright (c)  2002-2020, University of Amsterdam
    7                              VU University Amsterdam
    8                              CWI, Amsterdam
    9    All rights reserved.
   10
   11    Redistribution and use in source and binary forms, with or without
   12    modification, are permitted provided that the following conditions
   13    are met:
   14
   15    1. Redistributions of source code must retain the above copyright
   16       notice, this list of conditions and the following disclaimer.
   17
   18    2. Redistributions in binary form must reproduce the above copyright
   19       notice, this list of conditions and the following disclaimer in
   20       the documentation and/or other materials provided with the
   21       distribution.
   22
   23    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
   24    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
   25    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
   26    FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
   27    COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
   28    INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
   29    BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
   30    LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
   31    CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
   32    LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
   33    ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
   34    POSSIBILITY OF SUCH DAMAGE.
   35*/
   36
   37:- module(http_open_cp,
   38          [ http_open/3,                % +URL, -Stream, +Options
   39            http_set_authorization/2,   % +URL, +Authorization
   40            http_close_keep_alive/1     % +Address
   41          ]).   42:- autoload(library(aggregate),[aggregate_all/3]).   43:- autoload(library(apply),[foldl/4,include/3]).   44:- autoload(library(base64),[base64/3]).   45:- autoload(library(debug),[debug/3,debugging/1]).   46:- autoload(library(error),
   47	    [ domain_error/2, must_be/2, existence_error/2, instantiation_error/1
   48	    ]).   49:- autoload(library(lists),[last/2,member/2]).   50:- autoload(library(option),
   51	    [ meta_options/3, option/2, select_option/4, merge_options/3,
   52	      option/3, select_option/3
   53	    ]).   54:- autoload(library(readutil),[read_line_to_codes/2]).   55:- autoload(library(uri),
   56	    [ uri_resolve/3, uri_components/2, uri_data/3,
   57              uri_authority_components/2, uri_authority_data/3,
   58	      uri_encoded/3, uri_query_components/2, uri_is_global/1
   59	    ]).   60:- autoload(library(http/http_header_cp),
   61            [ http_parse_header/2, http_post_data/3 ]).   62:- autoload(library(http/http_stream),[stream_range_open/3]).   63:- if(exists_source(library(ssl))).   64:- autoload(library(ssl), [ssl_upgrade_legacy_options/2]).   65:- endif.   66:- use_module(library(socket)).

HTTP client library

This library defines http_open/3, which opens a URL as a Prolog stream. The functionality of the library can be extended by loading two additional modules that act as plugins:

library(http/http_ssl_plugin)
Loading this library causes http_open/3 to handle HTTPS connections. Relevant options for SSL certificate handling are handed to ssl_context/3. This plugin is loaded automatically if the scheme https is requested using a default SSL context. See the plugin for additional information regarding security.
library(zlib)
Loading this library supports the gzip transfer encoding. This plugin is lazily loaded if a connection is opened that claims this transfer encoding.
library(http/http_cookie)
Loading this library adds tracking cookies to http_open/3. Returned cookies are collected in the Prolog database and supplied for subsequent requests.
library(http/http_stream)
This library adds support for chunked encoding and makes the http_open/3 advertise itself as HTTP/1.1 instead of HTTP/1.0.

Here is a simple example to fetch a web-page:

?- http_open('http://www.google.com/search?q=prolog', In, []),
   copy_stream_data(In, user_output),
   close(In).
<!doctype html><head><title>prolog - Google Search</title><script>
...

The example below fetches the modification time of a web-page. Note that Modified is '' (the empty atom) if the web-server does not provide a time-stamp for the resource. See also parse_time/2.

modified(URL, Stamp) :-
        http_open(URL, In,
                  [ method(head),
                    header(last_modified, Modified)
                  ]),
        close(In),
        Modified \== '',
        parse_time(Modified, Stamp).

Then next example uses Google search. It exploits library(uri) to manage URIs, library(sgml) to load an HTML document and library(xpath) to navigate the parsed HTML. Note that you may need to adjust the XPath queries if the data returned by Google changes.

:- use_module(library(http/http_open)).
:- use_module(library(xpath)).
:- use_module(library(sgml)).
:- use_module(library(uri)).

google(For, Title, HREF) :-
        uri_encoded(query_value, For, Encoded),
        atom_concat('http://www.google.com/search?q=', Encoded, URL),
        http_open(URL, In, []),
        call_cleanup(
            load_html(In, DOM, []),
            close(In)),
        xpath(DOM, //h3(@class=r), Result),
        xpath(Result, //a(@href=HREF0, text), Title),
        uri_components(HREF0, Components),
        uri_data(search, Components, Query),
        uri_query_components(Query, Parts),
        memberchk(q=HREF, Parts).

An example query is below:

?- google(prolog, Title, HREF).
Title = 'SWI-Prolog',
HREF = 'http://www.swi-prolog.org/' ;
Title = 'Prolog - Wikipedia',
HREF = 'https://nl.wikipedia.org/wiki/Prolog' ;
Title = 'Prolog - Wikipedia, the free encyclopedia',
HREF = 'https://en.wikipedia.org/wiki/Prolog' ;
Title = 'Pro-Log is logistiek dienstverlener m.b.t. vervoer over water.',
HREF = 'http://www.pro-log.nl/' ;
Title = 'Learn Prolog Now!',
HREF = 'http://www.learnprolognow.org/' ;
Title = 'Free Online Version - Learn Prolog
...
See also
- load_html/3 and xpath/3 can be used to parse and navigate HTML documents.
- http_get/3 and http_post/4 provide an alternative interface that convert the reply depending on the Content-Type header. */
  172:- multifile
  173    http:encoding_filter/3,           % +Encoding, +In0, -In
  174    http:current_transfer_encoding/1, % ?Encoding
  175    http:disable_encoding_filter/1,   % +ContentType
  176    http:http_protocol_hook/5,        % +Protocol, +Parts, +StreamPair,
  177                                      % -NewStreamPair, +Options
  178    http:open_options/2,              % +Parts, -Options
  179    http:write_cookies/3,             % +Out, +Parts, +Options
  180    http:update_cookies/3,            % +CookieLine, +Parts, +Options
  181    http:authenticate_client/2,       % +URL, +Action
  182    http:http_connection_over_proxy/6.  183
  184:- meta_predicate
  185    http_open(+,-,:).  186
  187:- predicate_options(http_open/3, 3,
  188                     [ authorization(compound),
  189                       final_url(-atom),
  190                       header(+atom, -atom),
  191                       headers(-list),
  192                       connection(+atom),
  193                       method(oneof([delete,get,put,head,post,patch,options])),
  194                       size(-integer),
  195                       status_code(-integer),
  196                       output(-stream),
  197                       timeout(number),
  198                       unix_socket(+atom),
  199                       proxy(atom, integer),
  200                       proxy_authorization(compound),
  201                       bypass_proxy(boolean),
  202                       request_header(any),
  203                       user_agent(atom),
  204                       version(-compound),
  205        % The option below applies if library(http/http_header) is loaded
  206                       post(any),
  207        % The options below apply if library(http/http_ssl_plugin)) is loaded
  208                       pem_password_hook(callable),
  209                       cacert_file(atom),
  210                       cert_verify_hook(callable)
  211                     ]).
 user_agent(-Agent) is det
Default value for User-Agent, can be overruled using the option user_agent(Agent) of http_open/3.
  218user_agent('SWI-Prolog').
 http_open(+URL, -Stream, +Options) is det
Open the data at the HTTP server as a Prolog stream. URL is either an atom specifying a URL or a list representing a broken-down URL as specified below. After this predicate succeeds the data can be read from Stream. After completion this stream must be closed using the built-in Prolog predicate close/1. Options provides additional options:
authenticate(+Boolean)
If false (default true), do not try to automatically authenticate the client if a 401 (Unauthorized) status code is received.
authorization(+Term)
Send authorization. See also http_set_authorization/2. Supported schemes:
basic(+User, +Password)
HTTP Basic authentication.
bearer(+Token)
HTTP Bearer authentication.
digest(+User, +Password)
HTTP Digest authentication. This option is only provided if the plugin library(http/http_digest) is also loaded.
unix_socket(+Path)
Connect to the given Unix domain socket. In this scenario the host name and port or ignored. If the server replies with a redirect message and the host differs from the original host as normal TCP connection is used to handle the redirect. This option is inspired by curl(1)'s option `--unix-socket`.
connection(+Connection)
Specify the Connection header. Default is close. The alternative is Keep-alive. This maintains a pool of available connections as determined by keep_connection/1. The library(http/websockets) uses Keep-alive, Upgrade. Keep-alive connections can be closed explicitly using http_close_keep_alive/1. Keep-alive connections may significantly improve repetitive requests on the same server, especially if the IP route is long, HTTPS is used or the connection uses a proxy.
final_url(-FinalURL)
Unify FinalURL with the final destination. This differs from the original URL if the returned head of the original indicates an HTTP redirect (codes 301, 302 or 303). Without a redirect, FinalURL is the same as URL if URL is an atom, or a URL constructed from the parts.
header(Name, -AtomValue)
If provided, AtomValue is unified with the value of the indicated field in the reply header. Name is matched case-insensitive and the underscore (_) matches the hyphen (-). Multiple of these options may be provided to extract multiple header fields. If the header is not available AtomValue is unified to the empty atom ('').
headers(-List)
If provided, List is unified with a list of Name(Value) pairs corresponding to fields in the reply header. Name and Value follow the same conventions used by the header(Name,Value) option.
method(+Method)
One of get (default), head, delete, post, put or patch. The head message can be used in combination with the header(Name, Value) option to access information on the resource without actually fetching the resource itself. The returned stream must be closed immediately.

If post(Data) is provided, the default is post.

size(-Size)
Size is unified with the integer value of Content-Length in the reply header.
version(-Version)
Version is a pair Major-Minor, where Major and Minor are integers representing the HTTP version in the reply header.
range(+Range)
Ask for partial content. Range is a term Unit(From,To), where From is an integer and To is either an integer or the atom end. HTTP 1.1 only supports Unit = bytes. E.g., to ask for bytes 1000-1999, use the option range(bytes(1000,1999))
redirect(+Boolean)
If false (default true), do not automatically redirect if a 3XX code is received. Must be combined with status_code(Code) and one of the header options to read the redirect reply. In particular, without status_code(Code) a redirect is mapped to an exception.
status_code(-Code)
If this option is present and Code unifies with the HTTP status code, do not translate errors (4xx, 5xx) into an exception. Instead, http_open/3 behaves as if 2xx (success) is returned, providing the application to read the error document from the returned stream.
output(-Out)
Unify the output stream with Out and do not close it. This can be used to upgrade a connection.
timeout(+Timeout)
If provided, set a timeout on the stream using set_stream/2. With this option if no new data arrives within Timeout seconds the stream raises an exception. Default is to wait forever (infinite).
post(+Data)
Issue a POST request on the HTTP server. Data is handed to http_post_data/3.
proxy(+Host:Port)
Use an HTTP proxy to connect to the outside world. See also socket:proxy_for_url/3. This option overrules the proxy specification defined by socket:proxy_for_url/3.
proxy(+Host, +Port)
Synonym for proxy(+Host:Port). Deprecated.
proxy_authorization(+Authorization)
Send authorization to the proxy. Otherwise the same as the authorization option.
bypass_proxy(+Boolean)
If true, bypass proxy hooks. Default is false.
request_header(Name=Value)
Additional name-value parts are added in the order of appearance to the HTTP request header. No interpretation is done.
max_redirect(+Max)
Sets the maximum length of a redirection chain. This is needed for some IRIs that redirect indefinitely to other IRIs without looping (e.g., redirecting to IRIs with a random element in them). Max must be either a non-negative integer or the atom infinite. The default value is 10.
user_agent(+Agent)
Defines the value of the User-Agent field of the HTTP header. Default is SWI-Prolog.

The hook http:open_options/2 can be used to provide default options based on the broken-down URL. The option status_code(-Code) is particularly useful to query REST interfaces that commonly return status codes other than 200 that need to be be processed by the client code.

Arguments:
URL- is either an atom or string (url) or a list of parts.

When provided, this list may contain the fields scheme, user, password, host, port, path and either query_string (whose argument is an atom) or search (whose argument is a list of Name(Value) or Name=Value compound terms). Only host is mandatory. The example below opens the URL http://www.example.com/my/path?q=Hello%20World&lang=en. Note that values must not be quoted because the library inserts the required quotes.

http_open([ host('www.example.com'),
            path('/my/path'),
            search([ q='Hello world',
                     lang=en
                   ])
          ])
throws
- error(existence_error(url, Id),Context) is raised if the HTTP result code is not in the range 200..299. Context has the shape context(Message, status(Code, TextCode)), where Code is the numeric HTTP code and TextCode is the textual description thereof provided by the server. Message may provide additional details or may be unbound.
See also
- ssl_context/3 for SSL related options if library(http/http_ssl_plugin) is loaded.
  408:- multifile
  409    socket:proxy_for_url/3.           % +URL, +Host, -ProxyList
  410
  411http_open(URL, Stream, QOptions) :-
  412    meta_options(is_meta, QOptions, Options0),
  413    (   atomic(URL)
  414    ->  parse_url_ex(URL, Parts)
  415    ;   Parts = URL
  416    ),
  417    autoload_https(Parts),
  418    upgrade_ssl_options(Parts, Options0, Options),
  419    add_authorization(Parts, Options, Options1),
  420    findall(HostOptions, hooked_options(Parts, HostOptions), AllHostOptions),
  421    foldl(merge_options_rev, AllHostOptions, Options1, Options2),
  422    (   option(bypass_proxy(true), Options)
  423    ->  try_http_proxy(direct, Parts, Stream, Options2)
  424    ;   term_variables(Options2, Vars2),
  425        findall(Result-Vars2,
  426                try_a_proxy(Parts, Result, Options2),
  427                ResultList),
  428        last(ResultList, Status-Vars2)
  429    ->  (   Status = true(_Proxy, Stream)
  430        ->  true
  431        ;   throw(error(proxy_error(tried(ResultList)), _))
  432        )
  433    ;   try_http_proxy(direct, Parts, Stream, Options2)
  434    ).
  435
  436try_a_proxy(Parts, Result, Options) :-
  437    parts_uri(Parts, AtomicURL),
  438    option(host(Host), Parts),
  439    (   option(unix_socket(Path), Options)
  440    ->  Proxy = unix_socket(Path)
  441    ;   (   option(proxy(ProxyHost:ProxyPort), Options)
  442        ;   is_list(Options),
  443            memberchk(proxy(ProxyHost,ProxyPort), Options)
  444        )
  445    ->  Proxy = proxy(ProxyHost, ProxyPort)
  446    ;   socket:proxy_for_url(AtomicURL, Host, Proxy)
  447    ),
  448    debug(http(proxy),
  449          'http_open: Connecting via ~w to ~w', [Proxy, AtomicURL]),
  450    (   catch(try_http_proxy(Proxy, Parts, Stream, Options), E, true)
  451    ->  (   var(E)
  452        ->  !, Result = true(Proxy, Stream)
  453        ;   Result = error(Proxy, E)
  454        )
  455    ;   Result = false(Proxy)
  456    ),
  457    debug(http(proxy), 'http_open: ~w: ~p', [Proxy, Result]).
  458
  459try_http_proxy(Method, Parts, Stream, Options0) :-
  460    option(host(Host), Parts),
  461    proxy_request_uri(Method, Parts, RequestURI),
  462    select_option(visited(Visited0), Options0, OptionsV, []),
  463    Options = [visited([Parts|Visited0])|OptionsV],
  464    parts_scheme(Parts, Scheme),
  465    default_port(Scheme, DefPort),
  466    url_part(port(Port), Parts, DefPort),
  467    host_and_port(Host, DefPort, Port, HostPort),
  468    (   option(connection(Connection), Options0),
  469        keep_alive(Connection),
  470        get_from_pool(Host:Port, StreamPair),
  471        debug(http(connection), 'Trying Keep-alive to ~p using ~p',
  472              [ Host:Port, StreamPair ]),
  473        catch(send_rec_header(StreamPair, Stream, HostPort,
  474                              RequestURI, Parts, Options),
  475              error(E,_),
  476              keep_alive_error(E))
  477    ->  true
  478    ;   http:http_connection_over_proxy(Method, Parts, Host:Port,
  479                                        SocketStreamPair, Options, Options1),
  480        (   catch(http:http_protocol_hook(Scheme, Parts,
  481                                          SocketStreamPair,
  482                                          StreamPair, Options),
  483                  Error,
  484                  ( close(SocketStreamPair, [force(true)]),
  485                    throw(Error)))
  486        ->  true
  487        ;   StreamPair = SocketStreamPair
  488        ),
  489        send_rec_header(StreamPair, Stream, HostPort,
  490                        RequestURI, Parts, Options1)
  491    ),
  492    return_final_url(Options).
  493
  494proxy_request_uri(direct, Parts, RequestURI) :-
  495    !,
  496    parts_request_uri(Parts, RequestURI).
  497proxy_request_uri(unix_socket(_), Parts, RequestURI) :-
  498    !,
  499    parts_request_uri(Parts, RequestURI).
  500proxy_request_uri(_, Parts, RequestURI) :-
  501    parts_uri(Parts, RequestURI).
  502
  503http:http_connection_over_proxy(unix_socket(Path), _, _,
  504                                StreamPair, Options, Options) :-
  505    !,
  506    unix_domain_socket(Socket),
  507    tcp_connect(Socket, Path),
  508    tcp_open_socket(Socket, In, Out),
  509    stream_pair(StreamPair, In, Out).
  510http:http_connection_over_proxy(direct, _, Host:Port,
  511                                StreamPair, Options, Options) :-
  512    !,
  513    open_socket(Host:Port, StreamPair, Options).
  514http:http_connection_over_proxy(proxy(ProxyHost, ProxyPort), Parts, _,
  515                                StreamPair, Options, Options) :-
  516    \+ ( memberchk(scheme(Scheme), Parts),
  517         secure_scheme(Scheme)
  518       ),
  519    !,
  520    % We do not want any /more/ proxy after this
  521    open_socket(ProxyHost:ProxyPort, StreamPair,
  522                [bypass_proxy(true)|Options]).
  523http:http_connection_over_proxy(socks(SocksHost, SocksPort), _Parts, Host:Port,
  524                                StreamPair, Options, Options) :-
  525    !,
  526    tcp_connect(SocksHost:SocksPort, StreamPair, [bypass_proxy(true)]),
  527    catch(negotiate_socks_connection(Host:Port, StreamPair),
  528          Error,
  529          ( close(StreamPair, [force(true)]),
  530            throw(Error)
  531          )).
 hooked_options(+Parts, -Options) is nondet
Calls http:open_options/2 and if necessary upgrades old SSL cacerts_file(File) option to a cacerts(List) option to ensure proper merging of options.
  539hooked_options(Parts, Options) :-
  540    http:open_options(Parts, Options0),
  541    upgrade_ssl_options(Parts, Options0, Options).
  542
  543:- if(current_predicate(ssl_upgrade_legacy_options/2)).  544upgrade_ssl_options(Parts, Options0, Options) :-
  545    requires_ssl(Parts),
  546    !,
  547    ssl_upgrade_legacy_options(Options0, Options).
  548:- endif.  549upgrade_ssl_options(_, Options, Options).
  550
  551merge_options_rev(Old, New, Merged) :-
  552    merge_options(New, Old, Merged).
  553
  554is_meta(pem_password_hook).             % SSL plugin callbacks
  555is_meta(cert_verify_hook).
  556
  557
  558http:http_protocol_hook(http, _, StreamPair, StreamPair, _).
  559
  560default_port(https, 443) :- !.
  561default_port(wss,   443) :- !.
  562default_port(_,     80).
  563
  564host_and_port(Host, DefPort, DefPort, Host) :- !.
  565host_and_port(Host, _,       Port,    Host:Port).
 autoload_https(+Parts) is det
If the requested scheme is https or wss, load the HTTPS plugin.
  571autoload_https(Parts) :-
  572    requires_ssl(Parts),
  573    memberchk(scheme(S), Parts),
  574    \+ clause(http:http_protocol_hook(S, _, StreamPair, StreamPair, _),_),
  575    exists_source(library(http/http_ssl_plugin)),
  576    !,
  577    use_module(library(http/http_ssl_plugin)).
  578autoload_https(_).
  579
  580requires_ssl(Parts) :-
  581    memberchk(scheme(S), Parts),
  582    secure_scheme(S).
  583
  584secure_scheme(https).
  585secure_scheme(wss).
 send_rec_header(+StreamPair, -Stream, +Host, +RequestURI, +Parts, +Options) is det
Send header to Out and process reply. If there is an error or failure, close In and Out and return the error or failure.
  593send_rec_header(StreamPair, Stream, Host, RequestURI, Parts, Options) :-
  594    (   catch(guarded_send_rec_header(StreamPair, Stream,
  595                                      Host, RequestURI, Parts, Options),
  596              E, true)
  597    ->  (   var(E)
  598        ->  (   option(output(StreamPair), Options)
  599            ->  true
  600            ;   true
  601            )
  602        ;   close(StreamPair, [force(true)]),
  603            throw(E)
  604        )
  605    ;   close(StreamPair, [force(true)]),
  606        fail
  607    ).
  608
  609guarded_send_rec_header(StreamPair, Stream, Host, RequestURI, Parts, Options) :-
  610    user_agent(Agent, Options),
  611    method(Options, MNAME),
  612    http_version(Version),
  613    option(connection(Connection), Options, close),
  614    debug(http(send_request), "> ~w ~w HTTP/~w", [MNAME, RequestURI, Version]),
  615    debug(http(send_request), "> Host: ~w", [Host]),
  616    debug(http(send_request), "> User-Agent: ~w", [Agent]),
  617    debug(http(send_request), "> Connection: ~w", [Connection]),
  618    format(StreamPair,
  619           '~w ~w HTTP/~w\r\n\c
  620               Host: ~w\r\n\c
  621               User-Agent: ~w\r\n\c
  622               Connection: ~w\r\n',
  623           [MNAME, RequestURI, Version, Host, Agent, Connection]),
  624    parts_uri(Parts, URI),
  625    x_headers(Options, URI, StreamPair),
  626    write_cookies(StreamPair, Parts, Options),
  627    (   option(post(PostData), Options)
  628    ->  http_post_data(PostData, StreamPair, [])
  629    ;   format(StreamPair, '\r\n', [])
  630    ),
  631    flush_output(StreamPair),
  632                                    % read the reply header
  633    read_header(StreamPair, Parts, ReplyVersion, Code, Comment, Lines),
  634    update_cookies(Lines, Parts, Options),
  635    ignore(option(raw_headers(Lines), Options)),
  636    do_open(ReplyVersion, Code, Comment, Lines, Options, Parts, Host,
  637            StreamPair, Stream).
 http_version(-Version:atom) is det
HTTP version we publish. We can only use 1.1 if we support chunked encoding.
  645http_version('1.1') :-
  646    http:current_transfer_encoding(chunked),
  647    !.
  648http_version('1.0').
  649
  650method(Options, MNAME) :-
  651    option(post(_), Options),
  652    !,
  653    option(method(M), Options, post),
  654    (   map_method(M, MNAME0)
  655    ->  MNAME = MNAME0
  656    ;   domain_error(method, M)
  657    ).
  658method(Options, MNAME) :-
  659    option(method(M), Options, get),
  660    (   map_method(M, MNAME0)
  661    ->  MNAME = MNAME0
  662    ;   map_method(_, M)
  663    ->  MNAME = M
  664    ;   domain_error(method, M)
  665    ).
 map_method(+MethodID, -Method)
Support additional METHOD keywords. Default are the official HTTP methods as defined by the various RFCs.
  672:- multifile
  673    map_method/2.  674
  675map_method(delete,  'DELETE').
  676map_method(get,     'GET').
  677map_method(head,    'HEAD').
  678map_method(post,    'POST').
  679map_method(put,     'PUT').
  680map_method(patch,   'PATCH').
  681map_method(options, 'OPTIONS').
 x_headers(+Options, +URI, +Out) is det
Emit extra headers from request_header(Name=Value) options in Options.
To be done
- Use user/password fields
  690x_headers(Options, URI, Out) :-
  691    x_headers_(Options, [url(URI)|Options], Out).
  692
  693x_headers_([], _, _).
  694x_headers_([H|T], Options, Out) :-
  695    x_header(H, Options, Out),
  696    x_headers_(T, Options, Out).
  697
  698x_header(request_header(Name=Value), _, Out) :-
  699    !,
  700    debug(http(send_request), "> ~w: ~w", [Name, Value]),
  701    format(Out, '~w: ~w\r\n', [Name, Value]).
  702x_header(proxy_authorization(ProxyAuthorization), Options, Out) :-
  703    !,
  704    auth_header(ProxyAuthorization, Options, 'Proxy-Authorization', Out).
  705x_header(authorization(Authorization), Options, Out) :-
  706    !,
  707    auth_header(Authorization, Options, 'Authorization', Out).
  708x_header(range(Spec), _, Out) :-
  709    !,
  710    Spec =.. [Unit, From, To],
  711    (   To == end
  712    ->  ToT = ''
  713    ;   must_be(integer, To),
  714        ToT = To
  715    ),
  716    debug(http(send_request), "> Range: ~w=~d-~w", [Unit, From, ToT]),
  717    format(Out, 'Range: ~w=~d-~w\r\n', [Unit, From, ToT]).
  718x_header(_, _, _).
 auth_header(+AuthOption, +Options, +HeaderName, +Out)
  722auth_header(basic(User, Password), _, Header, Out) :-
  723    !,
  724    format(codes(Codes), '~w:~w', [User, Password]),
  725    phrase(base64(Codes), Base64Codes),
  726    debug(http(send_request), "> ~w: Basic ~s", [Header, Base64Codes]),
  727    format(Out, '~w: Basic ~s\r\n', [Header, Base64Codes]).
  728auth_header(bearer(Token), _, Header, Out) :-
  729    !,
  730    debug(http(send_request), "> ~w: Bearer ~w", [Header,Token]),
  731    format(Out, '~w: Bearer ~w\r\n', [Header, Token]).
  732auth_header(Auth, Options, _, Out) :-
  733    option(url(URL), Options),
  734    add_method(Options, Options1),
  735    http:authenticate_client(URL, send_auth_header(Auth, Out, Options1)),
  736    !.
  737auth_header(Auth, _, _, _) :-
  738    domain_error(authorization, Auth).
  739
  740user_agent(Agent, Options) :-
  741    (   option(user_agent(Agent), Options)
  742    ->  true
  743    ;   user_agent(Agent)
  744    ).
  745
  746add_method(Options0, Options) :-
  747    option(method(_), Options0),
  748    !,
  749    Options = Options0.
  750add_method(Options0, Options) :-
  751    option(post(_), Options0),
  752    !,
  753    Options = [method(post)|Options0].
  754add_method(Options0, [method(get)|Options0]).
 do_open(+HTTPVersion, +HTTPStatusCode, +HTTPStatusComment, +Header, +Options, +Parts, +Host, +In, -FinalIn) is det
Handle the HTTP status once available. If 200-299, we are ok. If a redirect, redo the open, returning a new stream. Else issue an error.
Errors
- existence_error(url, URL)
  765                                        % Redirections
  766do_open(_, Code, _, Lines, Options0, Parts, _, In, Stream) :-
  767    redirect_code(Code),
  768    option(redirect(true), Options0, true),
  769    location(Lines, RequestURI),
  770    !,
  771    debug(http(redirect), 'http_open: redirecting to ~w', [RequestURI]),
  772    close(In),
  773    parts_uri(Parts, Base),
  774    uri_resolve(RequestURI, Base, Redirected),
  775    parse_url_ex(Redirected, RedirectedParts),
  776    (   redirect_limit_exceeded(Options0, Max)
  777    ->  format(atom(Comment), 'max_redirect (~w) limit exceeded', [Max]),
  778        throw(error(permission_error(redirect, http, Redirected),
  779                    context(_, Comment)))
  780    ;   redirect_loop(RedirectedParts, Options0)
  781    ->  throw(error(permission_error(redirect, http, Redirected),
  782                    context(_, 'Redirection loop')))
  783    ;   true
  784    ),
  785    redirect_options(Parts, RedirectedParts, Options0, Options),
  786    http_open(RedirectedParts, Stream, Options).
  787                                        % Need authentication
  788do_open(_Version, Code, _Comment, Lines, Options0, Parts, _Host, In0, Stream) :-
  789    authenticate_code(Code),
  790    option(authenticate(true), Options0, true),
  791    parts_uri(Parts, URI),
  792    parse_headers(Lines, Headers),
  793    http:authenticate_client(
  794             URI,
  795             auth_reponse(Headers, Options0, Options)),
  796    !,
  797    close(In0),
  798    http_open(Parts, Stream, Options).
  799                                        % Accepted codes
  800do_open(Version, Code, _, Lines, Options, Parts, Host, In0, In) :-
  801    (   option(status_code(Code), Options),
  802        Lines \== []
  803    ->  true
  804    ;   successful_code(Code)
  805    ),
  806    !,
  807    parts_uri(Parts, URI),
  808    parse_headers(Lines, Headers),
  809    return_version(Options, Version),
  810    return_size(Options, Headers),
  811    return_fields(Options, Headers),
  812    return_headers(Options, Headers),
  813    consider_keep_alive(Lines, Parts, Host, In0, In1, Options),
  814    transfer_encoding_filter(Lines, In1, In),
  815                                    % properly re-initialise the stream
  816    set_stream(In, file_name(URI)),
  817    set_stream(In, record_position(true)).
  818do_open(_, _, _, [], Options, _, _, _, _) :-
  819    option(connection(Connection), Options),
  820    keep_alive(Connection),
  821    !,
  822    throw(error(keep_alive(closed),_)).
  823                                        % report anything else as error
  824do_open(_Version, Code, Comment, _,  _, Parts, _, _, _) :-
  825    parts_uri(Parts, URI),
  826    (   map_error_code(Code, Error)
  827    ->  Formal =.. [Error, url, URI]
  828    ;   Formal = existence_error(url, URI)
  829    ),
  830    throw(error(Formal, context(_, status(Code, Comment)))).
  831
  832
  833successful_code(Code) :-
  834    between(200, 299, Code).
 redirect_limit_exceeded(+Options:list(compound), -Max:nonneg) is semidet
True if we have exceeded the maximum redirection length (default 10).
  840redirect_limit_exceeded(Options, Max) :-
  841    option(visited(Visited), Options, []),
  842    length(Visited, N),
  843    option(max_redirect(Max), Options, 10),
  844    (Max == infinite -> fail ; N > Max).
 redirect_loop(+Parts, +Options) is semidet
True if we are in a redirection loop. Note that some sites redirect once to the same place using cookies or similar, so we allow for two tries. In fact, we should probably test whether authorization or cookie headers have changed.
  854redirect_loop(Parts, Options) :-
  855    option(visited(Visited), Options, []),
  856    include(==(Parts), Visited, Same),
  857    length(Same, Count),
  858    Count > 2.
 redirect_options(+Parts, +RedirectedParts, +Options0, -Options) is det
A redirect from a POST should do a GET on the returned URI. This means we must remove the method(post) and post(Data) options from the original option-list.

If we are connecting over a Unix domain socket we drop this option if the redirect host does not match the initial host.

  870redirect_options(Parts, RedirectedParts, Options0, Options) :-
  871    select_option(unix_socket(_), Options0, Options1),
  872    memberchk(host(Host), Parts),
  873    memberchk(host(RHost), RedirectedParts),
  874    debug(http(redirect), 'http_open: redirecting AF_UNIX ~w to ~w',
  875          [Host, RHost]),
  876    Host \== RHost,
  877    !,
  878    redirect_options(Options1, Options).
  879redirect_options(_, _, Options0, Options) :-
  880    redirect_options(Options0, Options).
  881
  882redirect_options(Options0, Options) :-
  883    (   select_option(post(_), Options0, Options1)
  884    ->  true
  885    ;   Options1 = Options0
  886    ),
  887    (   select_option(method(Method), Options1, Options),
  888        \+ redirect_method(Method)
  889    ->  true
  890    ;   Options = Options1
  891    ).
  892
  893redirect_method(delete).
  894redirect_method(get).
  895redirect_method(head).
 map_error_code(+HTTPCode, -PrologError) is semidet
Map HTTP error codes to Prolog errors.
To be done
- Many more maps. Unfortunately many have no sensible Prolog counterpart.
  905map_error_code(401, permission_error).
  906map_error_code(403, permission_error).
  907map_error_code(404, existence_error).
  908map_error_code(405, permission_error).
  909map_error_code(407, permission_error).
  910map_error_code(410, existence_error).
  911
  912redirect_code(301).                     % Moved Permanently
  913redirect_code(302).                     % Found (previously "Moved Temporary")
  914redirect_code(303).                     % See Other
  915redirect_code(307).                     % Temporary Redirect
  916
  917authenticate_code(401).
 open_socket(+Address, -StreamPair, +Options) is det
Create and connect a client socket to Address. Options
timeout(+Timeout)
Sets timeout on the stream, after connecting the socket.
To be done
- Make timeout also work on tcp_connect/4.
- This is the same as do_connect/4 in http_client.pl
  930open_socket(Address, StreamPair, Options) :-
  931    debug(http(open), 'http_open: Connecting to ~p ...', [Address]),
  932    tcp_connect(Address, StreamPair, Options),
  933    stream_pair(StreamPair, In, Out),
  934    debug(http(open), '\tok ~p ---> ~p', [In, Out]),
  935    set_stream(In, record_position(false)),
  936    (   option(timeout(Timeout), Options)
  937    ->  set_stream(In, timeout(Timeout))
  938    ;   true
  939    ).
  940
  941
  942return_version(Options, Major-Minor) :-
  943    option(version(Major-Minor), Options, _).
  944
  945return_size(Options, Headers) :-
  946    (   memberchk(content_length(Size), Headers)
  947    ->  option(size(Size), Options, _)
  948    ;   true
  949    ).
  950
  951return_fields([], _).
  952return_fields([header(Name, Value)|T], Headers) :-
  953    !,
  954    (   Term =.. [Name,Value],
  955        memberchk(Term, Headers)
  956    ->  true
  957    ;   Value = ''
  958    ),
  959    return_fields(T, Headers).
  960return_fields([_|T], Lines) :-
  961    return_fields(T, Lines).
  962
  963return_headers(Options, Headers) :-
  964    option(headers(Headers), Options, _).
 parse_headers(+Lines, -Headers:list(compound)) is det
Parse the header lines for the headers(-List) option. Invalid header lines are skipped, printing a warning using pring_message/2.
  972parse_headers([], []) :- !.
  973parse_headers([Line|Lines], Headers) :-
  974    catch(http_parse_header(Line, [Header]), Error, true),
  975    (   var(Error)
  976    ->  Headers = [Header|More]
  977    ;   print_message(warning, Error),
  978        Headers = More
  979    ),
  980    parse_headers(Lines, More).
 return_final_url(+Options) is semidet
If Options contains final_url(URL), unify URL with the final URL after redirections.
  988return_final_url(Options) :-
  989    option(final_url(URL), Options),
  990    var(URL),
  991    !,
  992    option(visited([Parts|_]), Options),
  993    parts_uri(Parts, URL).
  994return_final_url(_).
 transfer_encoding_filter(+Lines, +In0, -In) is det
Install filters depending on the transfer encoding. If In0 is a stream-pair, we close the output side. If transfer-encoding is not specified, the content-encoding is interpreted as a synonym for transfer-encoding, because many servers incorrectly depend on this. Exceptions to this are content-types for which disable_encoding_filter/1 holds.
 1006transfer_encoding_filter(Lines, In0, In) :-
 1007    transfer_encoding(Lines, Encoding),
 1008    !,
 1009    transfer_encoding_filter_(Encoding, In0, In).
 1010transfer_encoding_filter(Lines, In0, In) :-
 1011    content_encoding(Lines, Encoding),
 1012    content_type(Lines, Type),
 1013    \+ http:disable_encoding_filter(Type),
 1014    !,
 1015    transfer_encoding_filter_(Encoding, In0, In).
 1016transfer_encoding_filter(_, In, In).
 1017
 1018transfer_encoding_filter_(Encoding, In0, In) :-
 1019    stream_pair(In0, In1, Out),
 1020    (   nonvar(Out)
 1021    ->  close(Out)
 1022    ;   true
 1023    ),
 1024    (   http:encoding_filter(Encoding, In1, In)
 1025    ->  true
 1026    ;   autoload_encoding(Encoding),
 1027        http:encoding_filter(Encoding, In1, In)
 1028    ->  true
 1029    ;   domain_error(http_encoding, Encoding)
 1030    ).
 1031
 1032:- multifile
 1033    autoload_encoding/1. 1034
 1035:- if(exists_source(library(zlib))). 1036autoload_encoding(gzip) :-
 1037    use_module(library(zlib)).
 1038:- endif. 1039
 1040content_type(Lines, Type) :-
 1041    member(Line, Lines),
 1042    phrase(field('content-type'), Line, Rest),
 1043    !,
 1044    atom_codes(Type, Rest).
 http:disable_encoding_filter(+ContentType) is semidet
Do not use the Content-encoding as Transfer-encoding encoding for specific values of ContentType. This predicate is multifile and can thus be extended by the user.
 1052http:disable_encoding_filter('application/x-gzip').
 1053http:disable_encoding_filter('application/x-tar').
 1054http:disable_encoding_filter('x-world/x-vrml').
 1055http:disable_encoding_filter('application/zip').
 1056http:disable_encoding_filter('application/x-gzip').
 1057http:disable_encoding_filter('application/x-zip-compressed').
 1058http:disable_encoding_filter('application/x-compress').
 1059http:disable_encoding_filter('application/x-compressed').
 1060http:disable_encoding_filter('application/x-spoon').
 transfer_encoding(+Lines, -Encoding) is semidet
True if Encoding is the value of the Transfer-encoding header.
 1067transfer_encoding(Lines, Encoding) :-
 1068    what_encoding(transfer_encoding, Lines, Encoding).
 1069
 1070what_encoding(What, Lines, Encoding) :-
 1071    member(Line, Lines),
 1072    phrase(encoding_(What, Debug), Line, Rest),
 1073    !,
 1074    atom_codes(Encoding, Rest),
 1075    debug(http(What), '~w: ~p', [Debug, Rest]).
 1076
 1077encoding_(content_encoding, 'Content-encoding') -->
 1078    field('content-encoding').
 1079encoding_(transfer_encoding, 'Transfer-encoding') -->
 1080    field('transfer-encoding').
 content_encoding(+Lines, -Encoding) is semidet
True if Encoding is the value of the Content-encoding header.
 1087content_encoding(Lines, Encoding) :-
 1088    what_encoding(content_encoding, Lines, Encoding).
 read_header(+In:istream, +Parts, -Version, -Code:int, -Comment:atom, -Lines:list) is det
Read the HTTP reply-header. If the reply is completely empty an existence error is thrown. If the replied header is otherwise invalid a 500 HTTP error is simulated, having the comment Invalid reply header.
Arguments:
Parts- A list of compound terms that describe the parsed request URI.
Version- HTTP reply version as Major-Minor pair
Code- Numeric HTTP reply-code
Comment- Comment of reply-code as atom
Lines- Remaining header lines as code-lists.
Errors
- existence_error(http_reply, Uri)
 1107read_header(In, Parts, Major-Minor, Code, Comment, Lines) :-
 1108    read_line_to_codes(In, Line),
 1109    (   Line == end_of_file
 1110    ->  parts_uri(Parts, Uri),
 1111        existence_error(http_reply,Uri)
 1112    ;   true
 1113    ),
 1114    Line \== end_of_file,
 1115    phrase(first_line(Major-Minor, Code, Comment), Line),
 1116    debug(http(open), 'HTTP/~d.~d ~w ~w', [Major, Minor, Code, Comment]),
 1117    read_line_to_codes(In, Line2),
 1118    rest_header(Line2, In, Lines),
 1119    !,
 1120    (   debugging(http(open))
 1121    ->  forall(member(HL, Lines),
 1122               debug(http(open), '~s', [HL]))
 1123    ;   true
 1124    ).
 1125read_header(_, _, 1-1, 500, 'Invalid reply header', []).
 1126
 1127rest_header([], _, []) :- !.            % blank line: end of header
 1128rest_header(L0, In, [L0|L]) :-
 1129    read_line_to_codes(In, L1),
 1130    rest_header(L1, In, L).
 content_length(+Header, -Length:int) is semidet
Find the Content-Length in an HTTP reply-header.
 1136content_length(Lines, Length) :-
 1137    member(Line, Lines),
 1138    phrase(content_length(Length0), Line),
 1139    !,
 1140    Length = Length0.
 1141
 1142location(Lines, RequestURI) :-
 1143    member(Line, Lines),
 1144    phrase(atom_field(location, RequestURI), Line),
 1145    !.
 1146
 1147connection(Lines, Connection) :-
 1148    member(Line, Lines),
 1149    phrase(atom_field(connection, Connection0), Line),
 1150    !,
 1151    Connection = Connection0.
 1152
 1153first_line(Major-Minor, Code, Comment) -->
 1154    "HTTP/", integer(Major), ".", integer(Minor),
 1155    skip_blanks,
 1156    integer(Code),
 1157    skip_blanks,
 1158    rest(Comment).
 1159
 1160atom_field(Name, Value) -->
 1161    field(Name),
 1162    rest(Value).
 1163
 1164content_length(Len) -->
 1165    field('content-length'),
 1166    integer(Len).
 1167
 1168field(Name) -->
 1169    { atom_codes(Name, Codes) },
 1170    field_codes(Codes).
 1171
 1172field_codes([]) -->
 1173    ":",
 1174    skip_blanks.
 1175field_codes([H|T]) -->
 1176    [C],
 1177    { match_header_char(H, C)
 1178    },
 1179    field_codes(T).
 1180
 1181match_header_char(C, C) :- !.
 1182match_header_char(C, U) :-
 1183    code_type(C, to_lower(U)),
 1184    !.
 1185match_header_char(0'_, 0'-).
 1186
 1187
 1188skip_blanks -->
 1189    [C],
 1190    { code_type(C, white)
 1191    },
 1192    !,
 1193    skip_blanks.
 1194skip_blanks -->
 1195    [].
 integer(-Int)//
Read 1 or more digits and return as integer.
 1201integer(Code) -->
 1202    digit(D0),
 1203    digits(D),
 1204    { number_codes(Code, [D0|D])
 1205    }.
 1206
 1207digit(C) -->
 1208    [C],
 1209    { code_type(C, digit)
 1210    }.
 1211
 1212digits([D0|D]) -->
 1213    digit(D0),
 1214    !,
 1215    digits(D).
 1216digits([]) -->
 1217    [].
 rest(-Atom:atom)//
Get rest of input as an atom.
 1223rest(Atom) --> call(rest_(Atom)).
 1224
 1225rest_(Atom, L, []) :-
 1226    atom_codes(Atom, L).
 1227
 1228
 1229                 /*******************************
 1230                 *   AUTHORIZATION MANAGEMENT   *
 1231                 *******************************/
 http_set_authorization(+URL, +Authorization) is det
Set user/password to supply with URLs that have URL as prefix. If Authorization is the atom -, possibly defined authorization is cleared. For example:
?- http_set_authorization('http://www.example.com/private/',
                          basic('John', 'Secret'))
To be done
- Move to a separate module, so http_get/3, etc. can use this too.
 1247:- dynamic
 1248    stored_authorization/2,
 1249    cached_authorization/2. 1250
 1251http_set_authorization(URL, Authorization) :-
 1252    must_be(atom, URL),
 1253    retractall(stored_authorization(URL, _)),
 1254    (   Authorization = (-)
 1255    ->  true
 1256    ;   check_authorization(Authorization),
 1257        assert(stored_authorization(URL, Authorization))
 1258    ),
 1259    retractall(cached_authorization(_,_)).
 1260
 1261check_authorization(Var) :-
 1262    var(Var),
 1263    !,
 1264    instantiation_error(Var).
 1265check_authorization(basic(User, Password)) :-
 1266    must_be(atom, User),
 1267    must_be(text, Password).
 1268check_authorization(digest(User, Password)) :-
 1269    must_be(atom, User),
 1270    must_be(text, Password).
 authorization(+URL, -Authorization) is semidet
True if Authorization must be supplied for URL.
To be done
- Cleanup cache if it gets too big.
 1278authorization(_, _) :-
 1279    \+ stored_authorization(_, _),
 1280    !,
 1281    fail.
 1282authorization(URL, Authorization) :-
 1283    cached_authorization(URL, Authorization),
 1284    !,
 1285    Authorization \== (-).
 1286authorization(URL, Authorization) :-
 1287    (   stored_authorization(Prefix, Authorization),
 1288        sub_atom(URL, 0, _, _, Prefix)
 1289    ->  assert(cached_authorization(URL, Authorization))
 1290    ;   assert(cached_authorization(URL, -)),
 1291        fail
 1292    ).
 1293
 1294add_authorization(_, Options, Options) :-
 1295    option(authorization(_), Options),
 1296    !.
 1297add_authorization(Parts, Options0, Options) :-
 1298    url_part(user(User), Parts),
 1299    url_part(password(Passwd), Parts),
 1300    !,
 1301    Options = [authorization(basic(User,Passwd))|Options0].
 1302add_authorization(Parts, Options0, Options) :-
 1303    stored_authorization(_, _) ->   % quick test to avoid work
 1304    parts_uri(Parts, URL),
 1305    authorization(URL, Auth),
 1306    !,
 1307    Options = [authorization(Auth)|Options0].
 1308add_authorization(_, Options, Options).
 parse_url_ex(+URL, -Parts)
Parts: Scheme, Host, Port, User:Password, RequestURI (no fragment).
 1316parse_url_ex(URL, [uri(URL)|Parts]) :-
 1317    uri_components(URL, Components),
 1318    phrase(components(Components), Parts),
 1319    (   option(host(_), Parts)
 1320    ->  true
 1321    ;   domain_error(url, URL)
 1322    ).
 1323
 1324components(Components) -->
 1325    uri_scheme(Components),
 1326    uri_path(Components),
 1327    uri_authority(Components),
 1328    uri_request_uri(Components).
 1329
 1330uri_scheme(Components) -->
 1331    { uri_data(scheme, Components, Scheme), nonvar(Scheme) },
 1332    !,
 1333    [ scheme(Scheme)
 1334    ].
 1335uri_scheme(_) --> [].
 1336
 1337uri_path(Components) -->
 1338    { uri_data(path, Components, Path0), nonvar(Path0),
 1339      (   Path0 == ''
 1340      ->  Path = (/)
 1341      ;   Path = Path0
 1342      )
 1343    },
 1344    !,
 1345    [ path(Path)
 1346    ].
 1347uri_path(_) --> [].
 1348
 1349uri_authority(Components) -->
 1350    { uri_data(authority, Components, Auth), nonvar(Auth),
 1351      !,
 1352      uri_authority_components(Auth, Data)
 1353    },
 1354    [ authority(Auth) ],
 1355    auth_field(user, Data),
 1356    auth_field(password, Data),
 1357    auth_field(host, Data),
 1358    auth_field(port, Data).
 1359uri_authority(_) --> [].
 1360
 1361auth_field(Field, Data) -->
 1362    { uri_authority_data(Field, Data, EncValue), nonvar(EncValue),
 1363      !,
 1364      (   atom(EncValue)
 1365      ->  uri_encoded(query_value, Value, EncValue)
 1366      ;   Value = EncValue
 1367      ),
 1368      Part =.. [Field,Value]
 1369    },
 1370    [ Part ].
 1371auth_field(_, _) --> [].
 1372
 1373uri_request_uri(Components) -->
 1374    { uri_data(path, Components, Path0),
 1375      uri_data(search, Components, Search),
 1376      (   Path0 == ''
 1377      ->  Path = (/)
 1378      ;   Path = Path0
 1379      ),
 1380      uri_data(path, Components2, Path),
 1381      uri_data(search, Components2, Search),
 1382      uri_components(RequestURI, Components2)
 1383    },
 1384    [ request_uri(RequestURI)
 1385    ].
 parts_scheme(+Parts, -Scheme) is det
 parts_uri(+Parts, -URI) is det
 parts_request_uri(+Parts, -RequestURI) is det
 parts_search(+Parts, -Search) is det
 parts_authority(+Parts, -Authority) is semidet
 1393parts_scheme(Parts, Scheme) :-
 1394    url_part(scheme(Scheme), Parts),
 1395    !.
 1396parts_scheme(Parts, Scheme) :-          % compatibility with library(url)
 1397    url_part(protocol(Scheme), Parts),
 1398    !.
 1399parts_scheme(_, http).
 1400
 1401parts_authority(Parts, Auth) :-
 1402    url_part(authority(Auth), Parts),
 1403    !.
 1404parts_authority(Parts, Auth) :-
 1405    url_part(host(Host), Parts, _),
 1406    url_part(port(Port), Parts, _),
 1407    url_part(user(User), Parts, _),
 1408    url_part(password(Password), Parts, _),
 1409    uri_authority_components(Auth,
 1410                             uri_authority(User, Password, Host, Port)).
 1411
 1412parts_request_uri(Parts, RequestURI) :-
 1413    option(request_uri(RequestURI), Parts),
 1414    !.
 1415parts_request_uri(Parts, RequestURI) :-
 1416    url_part(path(Path), Parts, /),
 1417    ignore(parts_search(Parts, Search)),
 1418    uri_data(path, Data, Path),
 1419    uri_data(search, Data, Search),
 1420    uri_components(RequestURI, Data).
 1421
 1422parts_search(Parts, Search) :-
 1423    option(query_string(Search), Parts),
 1424    !.
 1425parts_search(Parts, Search) :-
 1426    option(search(Fields), Parts),
 1427    !,
 1428    uri_query_components(Search, Fields).
 1429
 1430
 1431parts_uri(Parts, URI) :-
 1432    option(uri(URI), Parts),
 1433    !.
 1434parts_uri(Parts, URI) :-
 1435    parts_scheme(Parts, Scheme),
 1436    ignore(parts_authority(Parts, Auth)),
 1437    parts_request_uri(Parts, RequestURI),
 1438    uri_components(RequestURI, Data),
 1439    uri_data(scheme, Data, Scheme),
 1440    uri_data(authority, Data, Auth),
 1441    uri_components(URI, Data).
 1442
 1443parts_port(Parts, Port) :-
 1444    parts_scheme(Parts, Scheme),
 1445    default_port(Scheme, DefPort),
 1446    url_part(port(Port), Parts, DefPort).
 1447
 1448url_part(Part, Parts) :-
 1449    Part =.. [Name,Value],
 1450    Gen =.. [Name,RawValue],
 1451    option(Gen, Parts),
 1452    !,
 1453    Value = RawValue.
 1454
 1455url_part(Part, Parts, Default) :-
 1456    Part =.. [Name,Value],
 1457    Gen =.. [Name,RawValue],
 1458    (   option(Gen, Parts)
 1459    ->  Value = RawValue
 1460    ;   Value = Default
 1461    ).
 1462
 1463
 1464                 /*******************************
 1465                 *            COOKIES           *
 1466                 *******************************/
 1467
 1468write_cookies(Out, Parts, Options) :-
 1469    http:write_cookies(Out, Parts, Options),
 1470    !.
 1471write_cookies(_, _, _).
 1472
 1473update_cookies(_, _, _) :-
 1474    predicate_property(http:update_cookies(_,_,_), number_of_clauses(0)),
 1475    !.
 1476update_cookies(Lines, Parts, Options) :-
 1477    (   member(Line, Lines),
 1478        phrase(atom_field('set_cookie', CookieData), Line),
 1479        http:update_cookies(CookieData, Parts, Options),
 1480        fail
 1481    ;   true
 1482    ).
 1483
 1484
 1485                 /*******************************
 1486                 *           OPEN ANY           *
 1487                 *******************************/
 1488
 1489:- multifile iostream:open_hook/6.
 iostream:open_hook(+Spec, +Mode, -Stream, -Close, +Options0, -Options) is semidet
Hook implementation that makes open_any/5 support http and https URLs for Mode == read.
 1497iostream:open_hook(URL, read, Stream, Close, Options0, Options) :-
 1498    (atom(URL) -> true ; string(URL)),
 1499    uri_is_global(URL),
 1500    uri_components(URL, Components),
 1501    uri_data(scheme, Components, Scheme),
 1502    http_scheme(Scheme),
 1503    !,
 1504    Options = Options0,
 1505    Close = close(Stream),
 1506    http_open(URL, Stream, Options0).
 1507
 1508http_scheme(http).
 1509http_scheme(https).
 1510
 1511
 1512                 /*******************************
 1513                 *          KEEP-ALIVE          *
 1514                 *******************************/
 consider_keep_alive(+HeaderLines, +Parts, +Host, +Stream0, -Stream, +Options) is det
 1520consider_keep_alive(Lines, Parts, Host, StreamPair, In, Options) :-
 1521    option(connection(Asked), Options),
 1522    keep_alive(Asked),
 1523    connection(Lines, Given),
 1524    keep_alive(Given),
 1525    content_length(Lines, Bytes),
 1526    !,
 1527    stream_pair(StreamPair, In0, _),
 1528    connection_address(Host, Parts, HostPort),
 1529    debug(http(connection),
 1530          'Keep-alive to ~w (~D bytes)', [HostPort, Bytes]),
 1531    stream_range_open(In0, In,
 1532                      [ size(Bytes),
 1533                        onclose(keep_alive(StreamPair, HostPort))
 1534                      ]).
 1535consider_keep_alive(_, _, _, Stream, Stream, _).
 1536
 1537connection_address(Host, _, Host) :-
 1538    Host = _:_,
 1539    !.
 1540connection_address(Host, Parts, Host:Port) :-
 1541    parts_port(Parts, Port).
 1542
 1543keep_alive(keep_alive) :- !.
 1544keep_alive(Connection) :-
 1545    downcase_atom(Connection, 'keep-alive').
 1546
 1547:- public keep_alive/4. 1548
 1549keep_alive(StreamPair, Host, _In, 0) :-
 1550    !,
 1551    debug(http(connection), 'Adding connection to ~p to pool', [Host]),
 1552    add_to_pool(Host, StreamPair).
 1553keep_alive(StreamPair, Host, In, Left) :-
 1554    Left < 100,
 1555    debug(http(connection), 'Reading ~D left bytes', [Left]),
 1556    read_incomplete(In, Left),
 1557    add_to_pool(Host, StreamPair),
 1558    !.
 1559keep_alive(StreamPair, _, _, _) :-
 1560    debug(http(connection),
 1561          'Closing connection due to excessive unprocessed input', []),
 1562    (   debugging(http(connection))
 1563    ->  catch(close(StreamPair), E,
 1564              print_message(warning, E))
 1565    ;   close(StreamPair, [force(true)])
 1566    ).
 read_incomplete(+In, +Left) is semidet
If we have not all input from a Keep-alive connection, read the remainder if it is short. Else, we fail and close the stream.
 1573read_incomplete(In, Left) :-
 1574    catch(setup_call_cleanup(
 1575              open_null_stream(Null),
 1576              copy_stream_data(In, Null, Left),
 1577              close(Null)),
 1578          _,
 1579          fail).
 1580
 1581:- dynamic
 1582    connection_pool/4,              % Hash, Address, Stream, Time
 1583    connection_gc_time/1. 1584
 1585add_to_pool(Address, StreamPair) :-
 1586    keep_connection(Address),
 1587    get_time(Now),
 1588    term_hash(Address, Hash),
 1589    assertz(connection_pool(Hash, Address, StreamPair, Now)).
 1590
 1591get_from_pool(Address, StreamPair) :-
 1592    term_hash(Address, Hash),
 1593    retract(connection_pool(Hash, Address, StreamPair, _)).
 keep_connection(+Address) is semidet
Succeeds if we want to keep the connection open. We currently keep a maximum of 10 connections waiting and a maximum of 2 waiting for the same address. Connections older than 2 seconds are closed.
 1602keep_connection(Address) :-
 1603    close_old_connections(2),
 1604    predicate_property(connection_pool(_,_,_,_), number_of_clauses(C)),
 1605    C =< 10,
 1606    term_hash(Address, Hash),
 1607    aggregate_all(count, connection_pool(Hash, Address, _, _), Count),
 1608    Count =< 2.
 1609
 1610close_old_connections(Timeout) :-
 1611    get_time(Now),
 1612    Before is Now - Timeout,
 1613    (   connection_gc_time(GC),
 1614        GC > Before
 1615    ->  true
 1616    ;   (   retractall(connection_gc_time(_)),
 1617            asserta(connection_gc_time(Now)),
 1618            connection_pool(Hash, Address, StreamPair, Added),
 1619            Added < Before,
 1620            retract(connection_pool(Hash, Address, StreamPair, Added)),
 1621            debug(http(connection),
 1622                  'Closing inactive keep-alive to ~p', [Address]),
 1623            close(StreamPair, [force(true)]),
 1624            fail
 1625        ;   true
 1626        )
 1627    ).
 http_close_keep_alive(+Address) is det
Close all keep-alive connections matching Address. Address is of the form Host:Port. In particular, http_close_keep_alive(_) closes all currently known keep-alive connections.
 1636http_close_keep_alive(Address) :-
 1637    forall(get_from_pool(Address, StreamPair),
 1638           close(StreamPair, [force(true)])).
 keep_alive_error(+Error)
Deal with an error from reusing a keep-alive connection. If the error is due to an I/O error or end-of-file, fail to backtrack over get_from_pool/2. Otherwise it is a real error and we thus re-raise it.
 1647keep_alive_error(keep_alive(closed)) :-
 1648    !,
 1649    debug(http(connection), 'Keep-alive connection was closed', []),
 1650    fail.
 1651keep_alive_error(io_error(_,_)) :-
 1652    !,
 1653    debug(http(connection), 'IO error on Keep-alive connection', []),
 1654    fail.
 1655keep_alive_error(Error) :-
 1656    throw(Error).
 1657
 1658
 1659                 /*******************************
 1660                 *     HOOK DOCUMENTATION       *
 1661                 *******************************/
 http:open_options(+Parts, -Options) is nondet
This hook is used by the HTTP client library to define default options based on the the broken-down request-URL. The following example redirects all trafic, except for localhost over a proxy:
:- multifile
    http:open_options/2.

http:open_options(Parts, Options) :-
    option(host(Host), Parts),
    Host \== localhost,
    Options = [proxy('proxy.local', 3128)].

This hook may return multiple solutions. The returned options are combined using merge_options/3 where earlier solutions overrule later solutions.

 http:write_cookies(+Out, +Parts, +Options) is semidet
Emit a Cookie: header for the current connection. Out is an open stream to the HTTP server, Parts is the broken-down request (see uri_components/2) and Options is the list of options passed to http_open. The predicate is called as if using ignore/1.
See also
- complements http:update_cookies/3.
- library(http/http_cookie) implements cookie handling on top of these hooks.
 http:update_cookies(+CookieData, +Parts, +Options) is semidet
Update the cookie database. CookieData is the value of the Set-Cookie field, Parts is the broken-down request (see uri_components/2) and Options is the list of options passed to http_open.
See also
- complements http:write_cookies
- library(http/http_cookies) implements cookie handling on top of these hooks.