View source with raw comments or as raw
    1/*  Part of SWI-Prolog
    2
    3    Author:        Jan Wielemaker
    4    E-mail:        J.Wielemaker@vu.nl
    5    WWW:           http://www.swi-prolog.org
    6    Copyright (c)  2002-2020, University of Amsterdam
    7                              VU University Amsterdam
    8                              CWI, Amsterdam
    9    All rights reserved.
   10
   11    Redistribution and use in source and binary forms, with or without
   12    modification, are permitted provided that the following conditions
   13    are met:
   14
   15    1. Redistributions of source code must retain the above copyright
   16       notice, this list of conditions and the following disclaimer.
   17
   18    2. Redistributions in binary form must reproduce the above copyright
   19       notice, this list of conditions and the following disclaimer in
   20       the documentation and/or other materials provided with the
   21       distribution.
   22
   23    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
   24    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
   25    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
   26    FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
   27    COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
   28    INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
   29    BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
   30    LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
   31    CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
   32    LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
   33    ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
   34    POSSIBILITY OF SUCH DAMAGE.
   35*/
   36
   37:- module(http_open,
   38          [ http_open/3,                % +URL, -Stream, +Options
   39            http_set_authorization/2,   % +URL, +Authorization
   40            http_close_keep_alive/1     % +Address
   41          ]).   42:- autoload(library(aggregate),[aggregate_all/3]).   43:- autoload(library(apply),[foldl/4,include/3]).   44:- autoload(library(base64),[base64/3]).   45:- autoload(library(debug),[debug/3,debugging/1]).   46:- autoload(library(error),
   47	    [ domain_error/2, must_be/2, existence_error/2, instantiation_error/1
   48	    ]).   49:- autoload(library(lists),[last/2,member/2]).   50:- autoload(library(option),
   51	    [ meta_options/3, option/2, select_option/4, merge_options/3,
   52	      option/3, select_option/3
   53	    ]).   54:- autoload(library(readutil),[read_line_to_codes/2]).   55:- autoload(library(socket),
   56	    [tcp_connect/3,negotiate_socks_connection/2]).   57:- autoload(library(uri),
   58	    [ uri_resolve/3, uri_components/2, uri_data/3,
   59              uri_authority_components/2, uri_authority_data/3,
   60	      uri_encoded/3, uri_query_components/2, uri_is_global/1
   61	    ]).   62:- autoload(library(http/http_header),
   63            [ http_parse_header/2, http_post_data/3 ]).   64:- autoload(library(http/http_stream),[stream_range_open/3]).   65:- if(exists_source(library(ssl))).   66:- autoload(library(ssl), [ssl_upgrade_legacy_options/2]).   67:- endif.

HTTP client library

This library defines http_open/3, which opens a URL as a Prolog stream. The functionality of the library can be extended by loading two additional modules that act as plugins:

library(http/http_ssl_plugin)
Loading this library causes http_open/3 to handle HTTPS connections. Relevant options for SSL certificate handling are handed to ssl_context/3. This plugin is loaded automatically if the scheme https is requested using a default SSL context. See the plugin for additional information regarding security.
library(zlib)
Loading this library supports the gzip transfer encoding. This plugin is lazily loaded if a connection is opened that claims this transfer encoding.
library(http/http_cookie)
Loading this library adds tracking cookies to http_open/3. Returned cookies are collected in the Prolog database and supplied for subsequent requests.
library(http/http_stream)
This library adds support for chunked encoding and makes the http_open/3 advertise itself as HTTP/1.1 instead of HTTP/1.0.

Here is a simple example to fetch a web-page:

?- http_open('http://www.google.com/search?q=prolog', In, []),
   copy_stream_data(In, user_output),
   close(In).
<!doctype html><head><title>prolog - Google Search</title><script>
...

The example below fetches the modification time of a web-page. Note that Modified is '' (the empty atom) if the web-server does not provide a time-stamp for the resource. See also parse_time/2.

modified(URL, Stamp) :-
        http_open(URL, In,
                  [ method(head),
                    header(last_modified, Modified)
                  ]),
        close(In),
        Modified \== '',
        parse_time(Modified, Stamp).

Then next example uses Google search. It exploits library(uri) to manage URIs, library(sgml) to load an HTML document and library(xpath) to navigate the parsed HTML. Note that you may need to adjust the XPath queries if the data returned by Google changes.

:- use_module(library(http/http_open)).
:- use_module(library(xpath)).
:- use_module(library(sgml)).
:- use_module(library(uri)).

google(For, Title, HREF) :-
        uri_encoded(query_value, For, Encoded),
        atom_concat('http://www.google.com/search?q=', Encoded, URL),
        http_open(URL, In, []),
        call_cleanup(
            load_html(In, DOM, []),
            close(In)),
        xpath(DOM, //h3(@class=r), Result),
        xpath(Result, //a(@href=HREF0, text), Title),
        uri_components(HREF0, Components),
        uri_data(search, Components, Query),
        uri_query_components(Query, Parts),
        memberchk(q=HREF, Parts).

An example query is below:

?- google(prolog, Title, HREF).
Title = 'SWI-Prolog',
HREF = 'http://www.swi-prolog.org/' ;
Title = 'Prolog - Wikipedia',
HREF = 'https://nl.wikipedia.org/wiki/Prolog' ;
Title = 'Prolog - Wikipedia, the free encyclopedia',
HREF = 'https://en.wikipedia.org/wiki/Prolog' ;
Title = 'Pro-Log is logistiek dienstverlener m.b.t. vervoer over water.',
HREF = 'http://www.pro-log.nl/' ;
Title = 'Learn Prolog Now!',
HREF = 'http://www.learnprolognow.org/' ;
Title = 'Free Online Version - Learn Prolog
...
See also
- load_html/3 and xpath/3 can be used to parse and navigate HTML documents.
- http_get/3 and http_post/4 provide an alternative interface that convert the reply depending on the Content-Type header. */
  172:- multifile
  173    http:encoding_filter/3,           % +Encoding, +In0, -In
  174    http:current_transfer_encoding/1, % ?Encoding
  175    http:disable_encoding_filter/1,   % +ContentType
  176    http:http_protocol_hook/5,        % +Protocol, +Parts, +StreamPair,
  177                                      % -NewStreamPair, +Options
  178    http:open_options/2,              % +Parts, -Options
  179    http:write_cookies/3,             % +Out, +Parts, +Options
  180    http:update_cookies/3,            % +CookieLine, +Parts, +Options
  181    http:authenticate_client/2,       % +URL, +Action
  182    http:http_connection_over_proxy/6.  183
  184:- meta_predicate
  185    http_open(+,-,:).  186
  187:- predicate_options(http_open/3, 3,
  188                     [ authorization(compound),
  189                       final_url(-atom),
  190                       header(+atom, -atom),
  191                       headers(-list),
  192                       connection(+atom),
  193                       method(oneof([delete,get,put,head,post,patch,options])),
  194                       size(-integer),
  195                       status_code(-integer),
  196                       output(-stream),
  197                       timeout(number),
  198                       proxy(atom, integer),
  199                       proxy_authorization(compound),
  200                       bypass_proxy(boolean),
  201                       request_header(any),
  202                       user_agent(atom),
  203                       version(-compound),
  204        % The option below applies if library(http/http_header) is loaded
  205                       post(any),
  206        % The options below apply if library(http/http_ssl_plugin)) is loaded
  207                       pem_password_hook(callable),
  208                       cacert_file(atom),
  209                       cert_verify_hook(callable)
  210                     ]).
 user_agent(-Agent) is det
Default value for User-Agent, can be overruled using the option user_agent(Agent) of http_open/3.
  217user_agent('SWI-Prolog').
 http_open(+URL, -Stream, +Options) is det
Open the data at the HTTP server as a Prolog stream. URL is either an atom specifying a URL or a list representing a broken-down URL as specified below. After this predicate succeeds the data can be read from Stream. After completion this stream must be closed using the built-in Prolog predicate close/1. Options provides additional options:
authenticate(+Boolean)
If false (default true), do not try to automatically authenticate the client if a 401 (Unauthorized) status code is received.
authorization(+Term)
Send authorization. See also http_set_authorization/2. Supported schemes:
basic(+User, +Password)
HTTP Basic authentication.
bearer(+Token)
HTTP Bearer authentication.
digest(+User, +Password)
HTTP Digest authentication. This option is only provided if the plugin library(http/http_digest) is also loaded.
connection(+Connection)
Specify the Connection header. Default is close. The alternative is Keep-alive. This maintains a pool of available connections as determined by keep_connection/1. The library(http/websockets) uses Keep-alive, Upgrade. Keep-alive connections can be closed explicitly using http_close_keep_alive/1. Keep-alive connections may significantly improve repetitive requests on the same server, especially if the IP route is long, HTTPS is used or the connection uses a proxy.
final_url(-FinalURL)
Unify FinalURL with the final destination. This differs from the original URL if the returned head of the original indicates an HTTP redirect (codes 301, 302 or 303). Without a redirect, FinalURL is the same as URL if URL is an atom, or a URL constructed from the parts.
header(Name, -AtomValue)
If provided, AtomValue is unified with the value of the indicated field in the reply header. Name is matched case-insensitive and the underscore (_) matches the hyphen (-). Multiple of these options may be provided to extract multiple header fields. If the header is not available AtomValue is unified to the empty atom ('').
headers(-List)
If provided, List is unified with a list of Name(Value) pairs corresponding to fields in the reply header. Name and Value follow the same conventions used by the header(Name,Value) option.
method(+Method)
One of get (default), head, delete, post, put or patch. The head message can be used in combination with the header(Name, Value) option to access information on the resource without actually fetching the resource itself. The returned stream must be closed immediately.

If post(Data) is provided, the default is post.

size(-Size)
Size is unified with the integer value of Content-Length in the reply header.
version(-Version)
Version is a pair Major-Minor, where Major and Minor are integers representing the HTTP version in the reply header.
range(+Range)
Ask for partial content. Range is a term Unit(From,To), where From is an integer and To is either an integer or the atom end. HTTP 1.1 only supports Unit = bytes. E.g., to ask for bytes 1000-1999, use the option range(bytes(1000,1999))
redirect(+Boolean)
If false (default true), do not automatically redirect if a 3XX code is received. Must be combined with status_code(Code) and one of the header options to read the redirect reply. In particular, without status_code(Code) a redirect is mapped to an exception.
status_code(-Code)
If this option is present and Code unifies with the HTTP status code, do not translate errors (4xx, 5xx) into an exception. Instead, http_open/3 behaves as if 2xx (success) is returned, providing the application to read the error document from the returned stream.
output(-Out)
Unify the output stream with Out and do not close it. This can be used to upgrade a connection.
timeout(+Timeout)
If provided, set a timeout on the stream using set_stream/2. With this option if no new data arrives within Timeout seconds the stream raises an exception. Default is to wait forever (infinite).
post(+Data)
Issue a POST request on the HTTP server. Data is handed to http_post_data/3.
proxy(+Host:Port)
Use an HTTP proxy to connect to the outside world. See also socket:proxy_for_url/3. This option overrules the proxy specification defined by socket:proxy_for_url/3.
proxy(+Host, +Port)
Synonym for proxy(+Host:Port). Deprecated.
proxy_authorization(+Authorization)
Send authorization to the proxy. Otherwise the same as the authorization option.
bypass_proxy(+Boolean)
If true, bypass proxy hooks. Default is false.
request_header(Name=Value)
Additional name-value parts are added in the order of appearance to the HTTP request header. No interpretation is done.
max_redirect(+Max)
Sets the maximum length of a redirection chain. This is needed for some IRIs that redirect indefinitely to other IRIs without looping (e.g., redirecting to IRIs with a random element in them). Max must be either a non-negative integer or the atom infinite. The default value is 10.
user_agent(+Agent)
Defines the value of the User-Agent field of the HTTP header. Default is SWI-Prolog.

The hook http:open_options/2 can be used to provide default options based on the broken-down URL. The option status_code(-Code) is particularly useful to query REST interfaces that commonly return status codes other than 200 that need to be be processed by the client code.

Arguments:
URL- is either an atom or string (url) or a list of parts.

When provided, this list may contain the fields scheme, user, password, host, port, path and either query_string (whose argument is an atom) or search (whose argument is a list of Name(Value) or Name=Value compound terms). Only host is mandatory. The example below opens the URL http://www.example.com/my/path?q=Hello%20World&lang=en. Note that values must not be quoted because the library inserts the required quotes.

http_open([ host('www.example.com'),
            path('/my/path'),
            search([ q='Hello world',
                     lang=en
                   ])
          ])
throws
- error(existence_error(url, Id),Context) is raised if the HTTP result code is not in the range 200..299. Context has the shape context(Message, status(Code, TextCode)), where Code is the numeric HTTP code and TextCode is the textual description thereof provided by the server. Message may provide additional details or may be unbound.
See also
- ssl_context/3 for SSL related options if library(http/http_ssl_plugin) is loaded.
  399:- multifile
  400    socket:proxy_for_url/3.           % +URL, +Host, -ProxyList
  401
  402http_open(URL, Stream, QOptions) :-
  403    meta_options(is_meta, QOptions, Options0),
  404    (   atomic(URL)
  405    ->  parse_url_ex(URL, Parts)
  406    ;   Parts = URL
  407    ),
  408    autoload_https(Parts),
  409    upgrade_ssl_options(Parts, Options0, Options),
  410    add_authorization(Parts, Options, Options1),
  411    findall(HostOptions, hooked_options(Parts, HostOptions), AllHostOptions),
  412    foldl(merge_options_rev, AllHostOptions, Options1, Options2),
  413    (   option(bypass_proxy(true), Options)
  414    ->  try_http_proxy(direct, Parts, Stream, Options2)
  415    ;   term_variables(Options2, Vars2),
  416        findall(Result-Vars2,
  417                try_a_proxy(Parts, Result, Options2),
  418                ResultList),
  419        last(ResultList, Status-Vars2)
  420    ->  (   Status = true(_Proxy, Stream)
  421        ->  true
  422        ;   throw(error(proxy_error(tried(ResultList)), _))
  423        )
  424    ;   try_http_proxy(direct, Parts, Stream, Options2)
  425    ).
  426
  427try_a_proxy(Parts, Result, Options) :-
  428    parts_uri(Parts, AtomicURL),
  429    option(host(Host), Parts),
  430    (   (   option(proxy(ProxyHost:ProxyPort), Options)
  431        ;   is_list(Options),
  432            memberchk(proxy(ProxyHost,ProxyPort), Options)
  433        )
  434    ->  Proxy = proxy(ProxyHost, ProxyPort)
  435    ;   socket:proxy_for_url(AtomicURL, Host, Proxy)
  436    ),
  437    debug(http(proxy),
  438          'http_open: Connecting via ~w to ~w', [Proxy, AtomicURL]),
  439    (   catch(try_http_proxy(Proxy, Parts, Stream, Options), E, true)
  440    ->  (   var(E)
  441        ->  !, Result = true(Proxy, Stream)
  442        ;   Result = error(Proxy, E)
  443        )
  444    ;   Result = false(Proxy)
  445    ),
  446    debug(http(proxy), 'http_open: ~w: ~p', [Proxy, Result]).
  447
  448try_http_proxy(Method, Parts, Stream, Options0) :-
  449    option(host(Host), Parts),
  450    (   Method == direct
  451    ->  parts_request_uri(Parts, RequestURI)
  452    ;   parts_uri(Parts, RequestURI)
  453    ),
  454    select_option(visited(Visited0), Options0, OptionsV, []),
  455    Options = [visited([Parts|Visited0])|OptionsV],
  456    parts_scheme(Parts, Scheme),
  457    default_port(Scheme, DefPort),
  458    url_part(port(Port), Parts, DefPort),
  459    host_and_port(Host, DefPort, Port, HostPort),
  460    (   option(connection(Connection), Options0),
  461        keep_alive(Connection),
  462        get_from_pool(Host:Port, StreamPair),
  463        debug(http(connection), 'Trying Keep-alive to ~p using ~p',
  464              [ Host:Port, StreamPair ]),
  465        catch(send_rec_header(StreamPair, Stream, HostPort,
  466                              RequestURI, Parts, Options),
  467              error(E,_),
  468              keep_alive_error(E))
  469    ->  true
  470    ;   http:http_connection_over_proxy(Method, Parts, Host:Port,
  471                                        SocketStreamPair, Options, Options1),
  472        (   catch(http:http_protocol_hook(Scheme, Parts,
  473                                          SocketStreamPair,
  474                                          StreamPair, Options),
  475                  Error,
  476                  ( close(SocketStreamPair, [force(true)]),
  477                    throw(Error)))
  478        ->  true
  479        ;   StreamPair = SocketStreamPair
  480        ),
  481        send_rec_header(StreamPair, Stream, HostPort,
  482                        RequestURI, Parts, Options1)
  483    ),
  484    return_final_url(Options).
  485
  486http:http_connection_over_proxy(direct, _, Host:Port,
  487                                StreamPair, Options, Options) :-
  488    !,
  489    open_socket(Host:Port, StreamPair, Options).
  490http:http_connection_over_proxy(proxy(ProxyHost, ProxyPort), Parts, _,
  491                                StreamPair, Options, Options) :-
  492    \+ ( memberchk(scheme(Scheme), Parts),
  493         secure_scheme(Scheme)
  494       ),
  495    !,
  496    % We do not want any /more/ proxy after this
  497    open_socket(ProxyHost:ProxyPort, StreamPair,
  498                [bypass_proxy(true)|Options]).
  499http:http_connection_over_proxy(socks(SocksHost, SocksPort), _Parts, Host:Port,
  500                                StreamPair, Options, Options) :-
  501    !,
  502    tcp_connect(SocksHost:SocksPort, StreamPair, [bypass_proxy(true)]),
  503    catch(negotiate_socks_connection(Host:Port, StreamPair),
  504          Error,
  505          ( close(StreamPair, [force(true)]),
  506            throw(Error)
  507          )).
 hooked_options(+Parts, -Options) is nondet
Calls http:open_options/2 and if necessary upgrades old SSL cacerts_file(File) option to a cacerts(List) option to ensure proper merging of options.
  515hooked_options(Parts, Options) :-
  516    http:open_options(Parts, Options0),
  517    upgrade_ssl_options(Parts, Options0, Options).
  518
  519:- if(current_predicate(ssl_upgrade_legacy_options/2)).  520upgrade_ssl_options(Parts, Options0, Options) :-
  521    requires_ssl(Parts),
  522    !,
  523    ssl_upgrade_legacy_options(Options0, Options).
  524:- endif.  525upgrade_ssl_options(_, Options, Options).
  526
  527merge_options_rev(Old, New, Merged) :-
  528    merge_options(New, Old, Merged).
  529
  530is_meta(pem_password_hook).             % SSL plugin callbacks
  531is_meta(cert_verify_hook).
  532
  533
  534http:http_protocol_hook(http, _, StreamPair, StreamPair, _).
  535
  536default_port(https, 443) :- !.
  537default_port(wss,   443) :- !.
  538default_port(_,     80).
  539
  540host_and_port(Host, DefPort, DefPort, Host) :- !.
  541host_and_port(Host, _,       Port,    Host:Port).
 autoload_https(+Parts) is det
If the requested scheme is https or wss, load the HTTPS plugin.
  547autoload_https(Parts) :-
  548    requires_ssl(Parts),
  549    memberchk(scheme(S), Parts),
  550    \+ clause(http:http_protocol_hook(S, _, StreamPair, StreamPair, _),_),
  551    exists_source(library(http/http_ssl_plugin)),
  552    !,
  553    use_module(library(http/http_ssl_plugin)).
  554autoload_https(_).
  555
  556requires_ssl(Parts) :-
  557    memberchk(scheme(S), Parts),
  558    secure_scheme(S).
  559
  560secure_scheme(https).
  561secure_scheme(wss).
 send_rec_header(+StreamPair, -Stream, +Host, +RequestURI, +Parts, +Options) is det
Send header to Out and process reply. If there is an error or failure, close In and Out and return the error or failure.
  569send_rec_header(StreamPair, Stream, Host, RequestURI, Parts, Options) :-
  570    (   catch(guarded_send_rec_header(StreamPair, Stream,
  571                                      Host, RequestURI, Parts, Options),
  572              E, true)
  573    ->  (   var(E)
  574        ->  (   option(output(StreamPair), Options)
  575            ->  true
  576            ;   true
  577            )
  578        ;   close(StreamPair, [force(true)]),
  579            throw(E)
  580        )
  581    ;   close(StreamPair, [force(true)]),
  582        fail
  583    ).
  584
  585guarded_send_rec_header(StreamPair, Stream, Host, RequestURI, Parts, Options) :-
  586    user_agent(Agent, Options),
  587    method(Options, MNAME),
  588    http_version(Version),
  589    option(connection(Connection), Options, close),
  590    debug(http(send_request), "> ~w ~w HTTP/~w", [MNAME, RequestURI, Version]),
  591    debug(http(send_request), "> Host: ~w", [Host]),
  592    debug(http(send_request), "> User-Agent: ~w", [Agent]),
  593    debug(http(send_request), "> Connection: ~w", [Connection]),
  594    format(StreamPair,
  595           '~w ~w HTTP/~w\r\n\c
  596               Host: ~w\r\n\c
  597               User-Agent: ~w\r\n\c
  598               Connection: ~w\r\n',
  599           [MNAME, RequestURI, Version, Host, Agent, Connection]),
  600    parts_uri(Parts, URI),
  601    x_headers(Options, URI, StreamPair),
  602    write_cookies(StreamPair, Parts, Options),
  603    (   option(post(PostData), Options)
  604    ->  http_post_data(PostData, StreamPair, [])
  605    ;   format(StreamPair, '\r\n', [])
  606    ),
  607    flush_output(StreamPair),
  608                                    % read the reply header
  609    read_header(StreamPair, Parts, ReplyVersion, Code, Comment, Lines),
  610    update_cookies(Lines, Parts, Options),
  611    do_open(ReplyVersion, Code, Comment, Lines, Options, Parts, Host,
  612            StreamPair, Stream).
 http_version(-Version:atom) is det
HTTP version we publish. We can only use 1.1 if we support chunked encoding.
  620http_version('1.1') :-
  621    http:current_transfer_encoding(chunked),
  622    !.
  623http_version('1.0').
  624
  625method(Options, MNAME) :-
  626    option(post(_), Options),
  627    !,
  628    option(method(M), Options, post),
  629    (   map_method(M, MNAME0)
  630    ->  MNAME = MNAME0
  631    ;   domain_error(method, M)
  632    ).
  633method(Options, MNAME) :-
  634    option(method(M), Options, get),
  635    (   map_method(M, MNAME0)
  636    ->  MNAME = MNAME0
  637    ;   map_method(_, M)
  638    ->  MNAME = M
  639    ;   domain_error(method, M)
  640    ).
  641
  642map_method(delete,  'DELETE').
  643map_method(get,     'GET').
  644map_method(head,    'HEAD').
  645map_method(post,    'POST').
  646map_method(put,     'PUT').
  647map_method(patch,   'PATCH').
  648map_method(options, 'OPTIONS').
 x_headers(+Options, +URI, +Out) is det
Emit extra headers from request_header(Name=Value) options in Options.
To be done
- Use user/password fields
  657x_headers(Options, URI, Out) :-
  658    x_headers_(Options, [url(URI)|Options], Out).
  659
  660x_headers_([], _, _).
  661x_headers_([H|T], Options, Out) :-
  662    x_header(H, Options, Out),
  663    x_headers_(T, Options, Out).
  664
  665x_header(request_header(Name=Value), _, Out) :-
  666    !,
  667    debug(http(send_request), "> ~w: ~w", [Name, Value]),
  668    format(Out, '~w: ~w\r\n', [Name, Value]).
  669x_header(proxy_authorization(ProxyAuthorization), Options, Out) :-
  670    !,
  671    auth_header(ProxyAuthorization, Options, 'Proxy-Authorization', Out).
  672x_header(authorization(Authorization), Options, Out) :-
  673    !,
  674    auth_header(Authorization, Options, 'Authorization', Out).
  675x_header(range(Spec), _, Out) :-
  676    !,
  677    Spec =.. [Unit, From, To],
  678    (   To == end
  679    ->  ToT = ''
  680    ;   must_be(integer, To),
  681        ToT = To
  682    ),
  683    debug(http(send_request), "> Range: ~w=~d-~w", [Unit, From, ToT]),
  684    format(Out, 'Range: ~w=~d-~w\r\n', [Unit, From, ToT]).
  685x_header(_, _, _).
 auth_header(+AuthOption, +Options, +HeaderName, +Out)
  689auth_header(basic(User, Password), _, Header, Out) :-
  690    !,
  691    format(codes(Codes), '~w:~w', [User, Password]),
  692    phrase(base64(Codes), Base64Codes),
  693    debug(http(send_request), "> ~w: Basic ~s", [Header, Base64Codes]),
  694    format(Out, '~w: Basic ~s\r\n', [Header, Base64Codes]).
  695auth_header(bearer(Token), _, Header, Out) :-
  696    !,
  697    debug(http(send_request), "> ~w: Bearer ~w", [Header,Token]),
  698    format(Out, '~w: Bearer ~w\r\n', [Header, Token]).
  699auth_header(Auth, Options, _, Out) :-
  700    option(url(URL), Options),
  701    add_method(Options, Options1),
  702    http:authenticate_client(URL, send_auth_header(Auth, Out, Options1)),
  703    !.
  704auth_header(Auth, _, _, _) :-
  705    domain_error(authorization, Auth).
  706
  707user_agent(Agent, Options) :-
  708    (   option(user_agent(Agent), Options)
  709    ->  true
  710    ;   user_agent(Agent)
  711    ).
  712
  713add_method(Options0, Options) :-
  714    option(method(_), Options0),
  715    !,
  716    Options = Options0.
  717add_method(Options0, Options) :-
  718    option(post(_), Options0),
  719    !,
  720    Options = [method(post)|Options0].
  721add_method(Options0, [method(get)|Options0]).
 do_open(+HTTPVersion, +HTTPStatusCode, +HTTPStatusComment, +Header, +Options, +Parts, +Host, +In, -FinalIn) is det
Handle the HTTP status once available. If 200-299, we are ok. If a redirect, redo the open, returning a new stream. Else issue an error.
Errors
- existence_error(url, URL)
  732                                        % Redirections
  733do_open(_, Code, _, Lines, Options0, Parts, _, In, Stream) :-
  734    redirect_code(Code),
  735    option(redirect(true), Options0, true),
  736    location(Lines, RequestURI),
  737    !,
  738    debug(http(redirect), 'http_open: redirecting to ~w', [RequestURI]),
  739    close(In),
  740    parts_uri(Parts, Base),
  741    uri_resolve(RequestURI, Base, Redirected),
  742    parse_url_ex(Redirected, RedirectedParts),
  743    (   redirect_limit_exceeded(Options0, Max)
  744    ->  format(atom(Comment), 'max_redirect (~w) limit exceeded', [Max]),
  745        throw(error(permission_error(redirect, http, Redirected),
  746                    context(_, Comment)))
  747    ;   redirect_loop(RedirectedParts, Options0)
  748    ->  throw(error(permission_error(redirect, http, Redirected),
  749                    context(_, 'Redirection loop')))
  750    ;   true
  751    ),
  752    redirect_options(Options0, Options),
  753    http_open(RedirectedParts, Stream, Options).
  754                                        % Need authentication
  755do_open(_Version, Code, _Comment, Lines, Options0, Parts, _Host, In0, Stream) :-
  756    authenticate_code(Code),
  757    option(authenticate(true), Options0, true),
  758    parts_uri(Parts, URI),
  759    parse_headers(Lines, Headers),
  760    http:authenticate_client(
  761             URI,
  762             auth_reponse(Headers, Options0, Options)),
  763    !,
  764    close(In0),
  765    http_open(Parts, Stream, Options).
  766                                        % Accepted codes
  767do_open(Version, Code, _, Lines, Options, Parts, Host, In0, In) :-
  768    (   option(status_code(Code), Options),
  769        Lines \== []
  770    ->  true
  771    ;   successful_code(Code)
  772    ),
  773    !,
  774    parts_uri(Parts, URI),
  775    parse_headers(Lines, Headers),
  776    return_version(Options, Version),
  777    return_size(Options, Headers),
  778    return_fields(Options, Headers),
  779    return_headers(Options, Headers),
  780    consider_keep_alive(Lines, Parts, Host, In0, In1, Options),
  781    transfer_encoding_filter(Lines, In1, In),
  782                                    % properly re-initialise the stream
  783    set_stream(In, file_name(URI)),
  784    set_stream(In, record_position(true)).
  785do_open(_, _, _, [], Options, _, _, _, _) :-
  786    option(connection(Connection), Options),
  787    keep_alive(Connection),
  788    !,
  789    throw(error(keep_alive(closed),_)).
  790                                        % report anything else as error
  791do_open(_Version, Code, Comment, _,  _, Parts, _, _, _) :-
  792    parts_uri(Parts, URI),
  793    (   map_error_code(Code, Error)
  794    ->  Formal =.. [Error, url, URI]
  795    ;   Formal = existence_error(url, URI)
  796    ),
  797    throw(error(Formal, context(_, status(Code, Comment)))).
  798
  799
  800successful_code(Code) :-
  801    between(200, 299, Code).
 redirect_limit_exceeded(+Options:list(compound), -Max:nonneg) is semidet
True if we have exceeded the maximum redirection length (default 10).
  807redirect_limit_exceeded(Options, Max) :-
  808    option(visited(Visited), Options, []),
  809    length(Visited, N),
  810    option(max_redirect(Max), Options, 10),
  811    (Max == infinite -> fail ; N > Max).
 redirect_loop(+Parts, +Options) is semidet
True if we are in a redirection loop. Note that some sites redirect once to the same place using cookies or similar, so we allow for two tries. In fact, we should probably test whether authorization or cookie headers have changed.
  821redirect_loop(Parts, Options) :-
  822    option(visited(Visited), Options, []),
  823    include(==(Parts), Visited, Same),
  824    length(Same, Count),
  825    Count > 2.
 redirect_options(+Options0, -Options) is det
A redirect from a POST should do a GET on the returned URI. This means we must remove the method(post) and post(Data) options from the original option-list.
  834redirect_options(Options0, Options) :-
  835    (   select_option(post(_), Options0, Options1)
  836    ->  true
  837    ;   Options1 = Options0
  838    ),
  839    (   select_option(method(Method), Options1, Options),
  840        \+ redirect_method(Method)
  841    ->  true
  842    ;   Options = Options1
  843    ).
  844
  845redirect_method(delete).
  846redirect_method(get).
  847redirect_method(head).
 map_error_code(+HTTPCode, -PrologError) is semidet
Map HTTP error codes to Prolog errors.
To be done
- Many more maps. Unfortunately many have no sensible Prolog counterpart.
  857map_error_code(401, permission_error).
  858map_error_code(403, permission_error).
  859map_error_code(404, existence_error).
  860map_error_code(405, permission_error).
  861map_error_code(407, permission_error).
  862map_error_code(410, existence_error).
  863
  864redirect_code(301).                     % Moved Permanently
  865redirect_code(302).                     % Found (previously "Moved Temporary")
  866redirect_code(303).                     % See Other
  867redirect_code(307).                     % Temporary Redirect
  868
  869authenticate_code(401).
 open_socket(+Address, -StreamPair, +Options) is det
Create and connect a client socket to Address. Options
timeout(+Timeout)
Sets timeout on the stream, after connecting the socket.
To be done
- Make timeout also work on tcp_connect/4.
- This is the same as do_connect/4 in http_client.pl
  882open_socket(Address, StreamPair, Options) :-
  883    debug(http(open), 'http_open: Connecting to ~p ...', [Address]),
  884    tcp_connect(Address, StreamPair, Options),
  885    stream_pair(StreamPair, In, Out),
  886    debug(http(open), '\tok ~p ---> ~p', [In, Out]),
  887    set_stream(In, record_position(false)),
  888    (   option(timeout(Timeout), Options)
  889    ->  set_stream(In, timeout(Timeout))
  890    ;   true
  891    ).
  892
  893
  894return_version(Options, Major-Minor) :-
  895    option(version(Major-Minor), Options, _).
  896
  897return_size(Options, Headers) :-
  898    (   memberchk(content_length(Size), Headers)
  899    ->  option(size(Size), Options, _)
  900    ;   true
  901    ).
  902
  903return_fields([], _).
  904return_fields([header(Name, Value)|T], Headers) :-
  905    !,
  906    (   Term =.. [Name,Value],
  907        memberchk(Term, Headers)
  908    ->  true
  909    ;   Value = ''
  910    ),
  911    return_fields(T, Headers).
  912return_fields([_|T], Lines) :-
  913    return_fields(T, Lines).
  914
  915return_headers(Options, Headers) :-
  916    option(headers(Headers), Options, _).
 parse_headers(+Lines, -Headers:list(compound)) is det
Parse the header lines for the headers(-List) option. Invalid header lines are skipped, printing a warning using pring_message/2.
  924parse_headers([], []) :- !.
  925parse_headers([Line|Lines], Headers) :-
  926    catch(http_parse_header(Line, [Header]), Error, true),
  927    (   var(Error)
  928    ->  Headers = [Header|More]
  929    ;   print_message(warning, Error),
  930        Headers = More
  931    ),
  932    parse_headers(Lines, More).
 return_final_url(+Options) is semidet
If Options contains final_url(URL), unify URL with the final URL after redirections.
  940return_final_url(Options) :-
  941    option(final_url(URL), Options),
  942    var(URL),
  943    !,
  944    option(visited([Parts|_]), Options),
  945    parts_uri(Parts, URL).
  946return_final_url(_).
 transfer_encoding_filter(+Lines, +In0, -In) is det
Install filters depending on the transfer encoding. If In0 is a stream-pair, we close the output side. If transfer-encoding is not specified, the content-encoding is interpreted as a synonym for transfer-encoding, because many servers incorrectly depend on this. Exceptions to this are content-types for which disable_encoding_filter/1 holds.
  958transfer_encoding_filter(Lines, In0, In) :-
  959    transfer_encoding(Lines, Encoding),
  960    !,
  961    transfer_encoding_filter_(Encoding, In0, In).
  962transfer_encoding_filter(Lines, In0, In) :-
  963    content_encoding(Lines, Encoding),
  964    content_type(Lines, Type),
  965    \+ http:disable_encoding_filter(Type),
  966    !,
  967    transfer_encoding_filter_(Encoding, In0, In).
  968transfer_encoding_filter(_, In, In).
  969
  970transfer_encoding_filter_(Encoding, In0, In) :-
  971    stream_pair(In0, In1, Out),
  972    (   nonvar(Out)
  973    ->  close(Out)
  974    ;   true
  975    ),
  976    (   http:encoding_filter(Encoding, In1, In)
  977    ->  true
  978    ;   autoload_encoding(Encoding),
  979        http:encoding_filter(Encoding, In1, In)
  980    ->  true
  981    ;   domain_error(http_encoding, Encoding)
  982    ).
  983
  984:- multifile
  985    autoload_encoding/1.  986
  987:- if(exists_source(library(zlib))).  988autoload_encoding(gzip) :-
  989    use_module(library(zlib)).
  990:- endif.  991
  992content_type(Lines, Type) :-
  993    member(Line, Lines),
  994    phrase(field('content-type'), Line, Rest),
  995    !,
  996    atom_codes(Type, Rest).
 http:disable_encoding_filter(+ContentType) is semidet
Do not use the Content-encoding as Transfer-encoding encoding for specific values of ContentType. This predicate is multifile and can thus be extended by the user.
 1004http:disable_encoding_filter('application/x-gzip').
 1005http:disable_encoding_filter('application/x-tar').
 1006http:disable_encoding_filter('x-world/x-vrml').
 1007http:disable_encoding_filter('application/zip').
 1008http:disable_encoding_filter('application/x-gzip').
 1009http:disable_encoding_filter('application/x-zip-compressed').
 1010http:disable_encoding_filter('application/x-compress').
 1011http:disable_encoding_filter('application/x-compressed').
 1012http:disable_encoding_filter('application/x-spoon').
 transfer_encoding(+Lines, -Encoding) is semidet
True if Encoding is the value of the Transfer-encoding header.
 1019transfer_encoding(Lines, Encoding) :-
 1020    what_encoding(transfer_encoding, Lines, Encoding).
 1021
 1022what_encoding(What, Lines, Encoding) :-
 1023    member(Line, Lines),
 1024    phrase(encoding_(What, Debug), Line, Rest),
 1025    !,
 1026    atom_codes(Encoding, Rest),
 1027    debug(http(What), '~w: ~p', [Debug, Rest]).
 1028
 1029encoding_(content_encoding, 'Content-encoding') -->
 1030    field('content-encoding').
 1031encoding_(transfer_encoding, 'Transfer-encoding') -->
 1032    field('transfer-encoding').
 content_encoding(+Lines, -Encoding) is semidet
True if Encoding is the value of the Content-encoding header.
 1039content_encoding(Lines, Encoding) :-
 1040    what_encoding(content_encoding, Lines, Encoding).
 read_header(+In:stream, +Parts, -Version, -Code:int, -Comment:atom, -Lines:list) is det
Read the HTTP reply-header. If the reply is completely empty an existence error is thrown. If the replied header is otherwise invalid a 500 HTTP error is simulated, having the comment Invalid reply header.
Arguments:
Parts- A list of compound terms that describe the parsed request URI.
Version- HTTP reply version as Major-Minor pair
Code- Numeric HTTP reply-code
Comment- Comment of reply-code as atom
Lines- Remaining header lines as code-lists.
Errors
- existence_error(http_reply, Uri)
 1059read_header(In, Parts, Major-Minor, Code, Comment, Lines) :-
 1060    read_line_to_codes(In, Line),
 1061    (   Line == end_of_file
 1062    ->  parts_uri(Parts, Uri),
 1063        existence_error(http_reply,Uri)
 1064    ;   true
 1065    ),
 1066    Line \== end_of_file,
 1067    phrase(first_line(Major-Minor, Code, Comment), Line),
 1068    debug(http(open), 'HTTP/~d.~d ~w ~w', [Major, Minor, Code, Comment]),
 1069    read_line_to_codes(In, Line2),
 1070    rest_header(Line2, In, Lines),
 1071    !,
 1072    (   debugging(http(open))
 1073    ->  forall(member(HL, Lines),
 1074               debug(http(open), '~s', [HL]))
 1075    ;   true
 1076    ).
 1077read_header(_, _, 1-1, 500, 'Invalid reply header', []).
 1078
 1079rest_header([], _, []) :- !.            % blank line: end of header
 1080rest_header(L0, In, [L0|L]) :-
 1081    read_line_to_codes(In, L1),
 1082    rest_header(L1, In, L).
 content_length(+Header, -Length:int) is semidet
Find the Content-Length in an HTTP reply-header.
 1088content_length(Lines, Length) :-
 1089    member(Line, Lines),
 1090    phrase(content_length(Length0), Line),
 1091    !,
 1092    Length = Length0.
 1093
 1094location(Lines, RequestURI) :-
 1095    member(Line, Lines),
 1096    phrase(atom_field(location, RequestURI), Line),
 1097    !.
 1098
 1099connection(Lines, Connection) :-
 1100    member(Line, Lines),
 1101    phrase(atom_field(connection, Connection0), Line),
 1102    !,
 1103    Connection = Connection0.
 1104
 1105first_line(Major-Minor, Code, Comment) -->
 1106    "HTTP/", integer(Major), ".", integer(Minor),
 1107    skip_blanks,
 1108    integer(Code),
 1109    skip_blanks,
 1110    rest(Comment).
 1111
 1112atom_field(Name, Value) -->
 1113    field(Name),
 1114    rest(Value).
 1115
 1116content_length(Len) -->
 1117    field('content-length'),
 1118    integer(Len).
 1119
 1120field(Name) -->
 1121    { atom_codes(Name, Codes) },
 1122    field_codes(Codes).
 1123
 1124field_codes([]) -->
 1125    ":",
 1126    skip_blanks.
 1127field_codes([H|T]) -->
 1128    [C],
 1129    { match_header_char(H, C)
 1130    },
 1131    field_codes(T).
 1132
 1133match_header_char(C, C) :- !.
 1134match_header_char(C, U) :-
 1135    code_type(C, to_lower(U)),
 1136    !.
 1137match_header_char(0'_, 0'-).
 1138
 1139
 1140skip_blanks -->
 1141    [C],
 1142    { code_type(C, white)
 1143    },
 1144    !,
 1145    skip_blanks.
 1146skip_blanks -->
 1147    [].
 integer(-Int)//
Read 1 or more digits and return as integer.
 1153integer(Code) -->
 1154    digit(D0),
 1155    digits(D),
 1156    { number_codes(Code, [D0|D])
 1157    }.
 1158
 1159digit(C) -->
 1160    [C],
 1161    { code_type(C, digit)
 1162    }.
 1163
 1164digits([D0|D]) -->
 1165    digit(D0),
 1166    !,
 1167    digits(D).
 1168digits([]) -->
 1169    [].
 rest(-Atom:atom)//
Get rest of input as an atom.
 1175rest(Atom) --> call(rest_(Atom)).
 1176
 1177rest_(Atom, L, []) :-
 1178    atom_codes(Atom, L).
 1179
 1180
 1181                 /*******************************
 1182                 *   AUTHORIZATION MANAGEMENT   *
 1183                 *******************************/
 http_set_authorization(+URL, +Authorization) is det
Set user/password to supply with URLs that have URL as prefix. If Authorization is the atom -, possibly defined authorization is cleared. For example:
?- http_set_authorization('http://www.example.com/private/',
                          basic('John', 'Secret'))
To be done
- Move to a separate module, so http_get/3, etc. can use this too.
 1199:- dynamic
 1200    stored_authorization/2,
 1201    cached_authorization/2. 1202
 1203http_set_authorization(URL, Authorization) :-
 1204    must_be(atom, URL),
 1205    retractall(stored_authorization(URL, _)),
 1206    (   Authorization = (-)
 1207    ->  true
 1208    ;   check_authorization(Authorization),
 1209        assert(stored_authorization(URL, Authorization))
 1210    ),
 1211    retractall(cached_authorization(_,_)).
 1212
 1213check_authorization(Var) :-
 1214    var(Var),
 1215    !,
 1216    instantiation_error(Var).
 1217check_authorization(basic(User, Password)) :-
 1218    must_be(atom, User),
 1219    must_be(text, Password).
 1220check_authorization(digest(User, Password)) :-
 1221    must_be(atom, User),
 1222    must_be(text, Password).
 authorization(+URL, -Authorization) is semidet
True if Authorization must be supplied for URL.
To be done
- Cleanup cache if it gets too big.
 1230authorization(_, _) :-
 1231    \+ stored_authorization(_, _),
 1232    !,
 1233    fail.
 1234authorization(URL, Authorization) :-
 1235    cached_authorization(URL, Authorization),
 1236    !,
 1237    Authorization \== (-).
 1238authorization(URL, Authorization) :-
 1239    (   stored_authorization(Prefix, Authorization),
 1240        sub_atom(URL, 0, _, _, Prefix)
 1241    ->  assert(cached_authorization(URL, Authorization))
 1242    ;   assert(cached_authorization(URL, -)),
 1243        fail
 1244    ).
 1245
 1246add_authorization(_, Options, Options) :-
 1247    option(authorization(_), Options),
 1248    !.
 1249add_authorization(Parts, Options0, Options) :-
 1250    url_part(user(User), Parts),
 1251    url_part(password(Passwd), Parts),
 1252    !,
 1253    Options = [authorization(basic(User,Passwd))|Options0].
 1254add_authorization(Parts, Options0, Options) :-
 1255    stored_authorization(_, _) ->   % quick test to avoid work
 1256    parts_uri(Parts, URL),
 1257    authorization(URL, Auth),
 1258    !,
 1259    Options = [authorization(Auth)|Options0].
 1260add_authorization(_, Options, Options).
 parse_url_ex(+URL, -Parts)
Parts: Scheme, Host, Port, User:Password, RequestURI (no fragment).
 1268parse_url_ex(URL, [uri(URL)|Parts]) :-
 1269    uri_components(URL, Components),
 1270    phrase(components(Components), Parts),
 1271    (   option(host(_), Parts)
 1272    ->  true
 1273    ;   domain_error(url, URL)
 1274    ).
 1275
 1276components(Components) -->
 1277    uri_scheme(Components),
 1278    uri_path(Components),
 1279    uri_authority(Components),
 1280    uri_request_uri(Components).
 1281
 1282uri_scheme(Components) -->
 1283    { uri_data(scheme, Components, Scheme), nonvar(Scheme) },
 1284    !,
 1285    [ scheme(Scheme)
 1286    ].
 1287uri_scheme(_) --> [].
 1288
 1289uri_path(Components) -->
 1290    { uri_data(path, Components, Path0), nonvar(Path0),
 1291      (   Path0 == ''
 1292      ->  Path = (/)
 1293      ;   Path = Path0
 1294      )
 1295    },
 1296    !,
 1297    [ path(Path)
 1298    ].
 1299uri_path(_) --> [].
 1300
 1301uri_authority(Components) -->
 1302    { uri_data(authority, Components, Auth), nonvar(Auth),
 1303      !,
 1304      uri_authority_components(Auth, Data)
 1305    },
 1306    [ authority(Auth) ],
 1307    auth_field(user, Data),
 1308    auth_field(password, Data),
 1309    auth_field(host, Data),
 1310    auth_field(port, Data).
 1311uri_authority(_) --> [].
 1312
 1313auth_field(Field, Data) -->
 1314    { uri_authority_data(Field, Data, EncValue), nonvar(EncValue),
 1315      !,
 1316      (   atom(EncValue)
 1317      ->  uri_encoded(query_value, Value, EncValue)
 1318      ;   Value = EncValue
 1319      ),
 1320      Part =.. [Field,Value]
 1321    },
 1322    [ Part ].
 1323auth_field(_, _) --> [].
 1324
 1325uri_request_uri(Components) -->
 1326    { uri_data(path, Components, Path0),
 1327      uri_data(search, Components, Search),
 1328      (   Path0 == ''
 1329      ->  Path = (/)
 1330      ;   Path = Path0
 1331      ),
 1332      uri_data(path, Components2, Path),
 1333      uri_data(search, Components2, Search),
 1334      uri_components(RequestURI, Components2)
 1335    },
 1336    [ request_uri(RequestURI)
 1337    ].
 parts_scheme(+Parts, -Scheme) is det
 parts_uri(+Parts, -URI) is det
 parts_request_uri(+Parts, -RequestURI) is det
 parts_search(+Parts, -Search) is det
 parts_authority(+Parts, -Authority) is semidet
 1345parts_scheme(Parts, Scheme) :-
 1346    url_part(scheme(Scheme), Parts),
 1347    !.
 1348parts_scheme(Parts, Scheme) :-          % compatibility with library(url)
 1349    url_part(protocol(Scheme), Parts),
 1350    !.
 1351parts_scheme(_, http).
 1352
 1353parts_authority(Parts, Auth) :-
 1354    url_part(authority(Auth), Parts),
 1355    !.
 1356parts_authority(Parts, Auth) :-
 1357    url_part(host(Host), Parts, _),
 1358    url_part(port(Port), Parts, _),
 1359    url_part(user(User), Parts, _),
 1360    url_part(password(Password), Parts, _),
 1361    uri_authority_components(Auth,
 1362                             uri_authority(User, Password, Host, Port)).
 1363
 1364parts_request_uri(Parts, RequestURI) :-
 1365    option(request_uri(RequestURI), Parts),
 1366    !.
 1367parts_request_uri(Parts, RequestURI) :-
 1368    url_part(path(Path), Parts, /),
 1369    ignore(parts_search(Parts, Search)),
 1370    uri_data(path, Data, Path),
 1371    uri_data(search, Data, Search),
 1372    uri_components(RequestURI, Data).
 1373
 1374parts_search(Parts, Search) :-
 1375    option(query_string(Search), Parts),
 1376    !.
 1377parts_search(Parts, Search) :-
 1378    option(search(Fields), Parts),
 1379    !,
 1380    uri_query_components(Search, Fields).
 1381
 1382
 1383parts_uri(Parts, URI) :-
 1384    option(uri(URI), Parts),
 1385    !.
 1386parts_uri(Parts, URI) :-
 1387    parts_scheme(Parts, Scheme),
 1388    ignore(parts_authority(Parts, Auth)),
 1389    parts_request_uri(Parts, RequestURI),
 1390    uri_components(RequestURI, Data),
 1391    uri_data(scheme, Data, Scheme),
 1392    uri_data(authority, Data, Auth),
 1393    uri_components(URI, Data).
 1394
 1395parts_port(Parts, Port) :-
 1396    parts_scheme(Parts, Scheme),
 1397    default_port(Scheme, DefPort),
 1398    url_part(port(Port), Parts, DefPort).
 1399
 1400url_part(Part, Parts) :-
 1401    Part =.. [Name,Value],
 1402    Gen =.. [Name,RawValue],
 1403    option(Gen, Parts),
 1404    !,
 1405    Value = RawValue.
 1406
 1407url_part(Part, Parts, Default) :-
 1408    Part =.. [Name,Value],
 1409    Gen =.. [Name,RawValue],
 1410    (   option(Gen, Parts)
 1411    ->  Value = RawValue
 1412    ;   Value = Default
 1413    ).
 1414
 1415
 1416                 /*******************************
 1417                 *            COOKIES           *
 1418                 *******************************/
 1419
 1420write_cookies(Out, Parts, Options) :-
 1421    http:write_cookies(Out, Parts, Options),
 1422    !.
 1423write_cookies(_, _, _).
 1424
 1425update_cookies(_, _, _) :-
 1426    predicate_property(http:update_cookies(_,_,_), number_of_clauses(0)),
 1427    !.
 1428update_cookies(Lines, Parts, Options) :-
 1429    (   member(Line, Lines),
 1430        phrase(atom_field('set_cookie', CookieData), Line),
 1431        http:update_cookies(CookieData, Parts, Options),
 1432        fail
 1433    ;   true
 1434    ).
 1435
 1436
 1437                 /*******************************
 1438                 *           OPEN ANY           *
 1439                 *******************************/
 1440
 1441:- multifile iostream:open_hook/6.
 iostream:open_hook(+Spec, +Mode, -Stream, -Close, +Options0, -Options) is semidet
Hook implementation that makes open_any/5 support http and https URLs for Mode == read.
 1449iostream:open_hook(URL, read, Stream, Close, Options0, Options) :-
 1450    (atom(URL) -> true ; string(URL)),
 1451    uri_is_global(URL),
 1452    uri_components(URL, Components),
 1453    uri_data(scheme, Components, Scheme),
 1454    http_scheme(Scheme),
 1455    !,
 1456    Options = Options0,
 1457    Close = close(Stream),
 1458    http_open(URL, Stream, Options0).
 1459
 1460http_scheme(http).
 1461http_scheme(https).
 1462
 1463
 1464                 /*******************************
 1465                 *          KEEP-ALIVE          *
 1466                 *******************************/
 consider_keep_alive(+HeaderLines, +Parts, +Host, +Stream0, -Stream, +Options) is det
 1472consider_keep_alive(Lines, Parts, Host, StreamPair, In, Options) :-
 1473    option(connection(Asked), Options),
 1474    keep_alive(Asked),
 1475    connection(Lines, Given),
 1476    keep_alive(Given),
 1477    content_length(Lines, Bytes),
 1478    !,
 1479    stream_pair(StreamPair, In0, _),
 1480    connection_address(Host, Parts, HostPort),
 1481    debug(http(connection),
 1482          'Keep-alive to ~w (~D bytes)', [HostPort, Bytes]),
 1483    stream_range_open(In0, In,
 1484                      [ size(Bytes),
 1485                        onclose(keep_alive(StreamPair, HostPort))
 1486                      ]).
 1487consider_keep_alive(_, _, _, Stream, Stream, _).
 1488
 1489connection_address(Host, _, Host) :-
 1490    Host = _:_,
 1491    !.
 1492connection_address(Host, Parts, Host:Port) :-
 1493    parts_port(Parts, Port).
 1494
 1495keep_alive(keep_alive) :- !.
 1496keep_alive(Connection) :-
 1497    downcase_atom(Connection, 'keep-alive').
 1498
 1499:- public keep_alive/4. 1500
 1501keep_alive(StreamPair, Host, _In, 0) :-
 1502    !,
 1503    debug(http(connection), 'Adding connection to ~p to pool', [Host]),
 1504    add_to_pool(Host, StreamPair).
 1505keep_alive(StreamPair, Host, In, Left) :-
 1506    Left < 100,
 1507    debug(http(connection), 'Reading ~D left bytes', [Left]),
 1508    read_incomplete(In, Left),
 1509    add_to_pool(Host, StreamPair),
 1510    !.
 1511keep_alive(StreamPair, _, _, _) :-
 1512    debug(http(connection),
 1513          'Closing connection due to excessive unprocessed input', []),
 1514    (   debugging(http(connection))
 1515    ->  catch(close(StreamPair), E,
 1516              print_message(warning, E))
 1517    ;   close(StreamPair, [force(true)])
 1518    ).
 read_incomplete(+In, +Left) is semidet
If we have not all input from a Keep-alive connection, read the remainder if it is short. Else, we fail and close the stream.
 1525read_incomplete(In, Left) :-
 1526    catch(setup_call_cleanup(
 1527              open_null_stream(Null),
 1528              copy_stream_data(In, Null, Left),
 1529              close(Null)),
 1530          _,
 1531          fail).
 1532
 1533:- dynamic
 1534    connection_pool/4,              % Hash, Address, Stream, Time
 1535    connection_gc_time/1. 1536
 1537add_to_pool(Address, StreamPair) :-
 1538    keep_connection(Address),
 1539    get_time(Now),
 1540    term_hash(Address, Hash),
 1541    assertz(connection_pool(Hash, Address, StreamPair, Now)).
 1542
 1543get_from_pool(Address, StreamPair) :-
 1544    term_hash(Address, Hash),
 1545    retract(connection_pool(Hash, Address, StreamPair, _)).
 keep_connection(+Address) is semidet
Succeeds if we want to keep the connection open. We currently keep a maximum of 10 connections waiting and a maximum of 2 waiting for the same address. Connections older than 2 seconds are closed.
 1554keep_connection(Address) :-
 1555    close_old_connections(2),
 1556    predicate_property(connection_pool(_,_,_,_), number_of_clauses(C)),
 1557    C =< 10,
 1558    term_hash(Address, Hash),
 1559    aggregate_all(count, connection_pool(Hash, Address, _, _), Count),
 1560    Count =< 2.
 1561
 1562close_old_connections(Timeout) :-
 1563    get_time(Now),
 1564    Before is Now - Timeout,
 1565    (   connection_gc_time(GC),
 1566        GC > Before
 1567    ->  true
 1568    ;   (   retractall(connection_gc_time(_)),
 1569            asserta(connection_gc_time(Now)),
 1570            connection_pool(Hash, Address, StreamPair, Added),
 1571            Added < Before,
 1572            retract(connection_pool(Hash, Address, StreamPair, Added)),
 1573            debug(http(connection),
 1574                  'Closing inactive keep-alive to ~p', [Address]),
 1575            close(StreamPair, [force(true)]),
 1576            fail
 1577        ;   true
 1578        )
 1579    ).
 http_close_keep_alive(+Address) is det
Close all keep-alive connections matching Address. Address is of the form Host:Port. In particular, http_close_keep_alive(_) closes all currently known keep-alive connections.
 1588http_close_keep_alive(Address) :-
 1589    forall(get_from_pool(Address, StreamPair),
 1590           close(StreamPair, [force(true)])).
 keep_alive_error(+Error)
Deal with an error from reusing a keep-alive connection. If the error is due to an I/O error or end-of-file, fail to backtrack over get_from_pool/2. Otherwise it is a real error and we thus re-raise it.
 1599keep_alive_error(keep_alive(closed)) :-
 1600    !,
 1601    debug(http(connection), 'Keep-alive connection was closed', []),
 1602    fail.
 1603keep_alive_error(io_error(_,_)) :-
 1604    !,
 1605    debug(http(connection), 'IO error on Keep-alive connection', []),
 1606    fail.
 1607keep_alive_error(Error) :-
 1608    throw(Error).
 1609
 1610
 1611                 /*******************************
 1612                 *     HOOK DOCUMENTATION       *
 1613                 *******************************/
 http:open_options(+Parts, -Options) is nondet
This hook is used by the HTTP client library to define default options based on the the broken-down request-URL. The following example redirects all trafic, except for localhost over a proxy:
:- multifile
    http:open_options/2.

http:open_options(Parts, Options) :-
    option(host(Host), Parts),
    Host \== localhost,
    Options = [proxy('proxy.local', 3128)].

This hook may return multiple solutions. The returned options are combined using merge_options/3 where earlier solutions overrule later solutions.

 http:write_cookies(+Out, +Parts, +Options) is semidet
Emit a Cookie: header for the current connection. Out is an open stream to the HTTP server, Parts is the broken-down request (see uri_components/2) and Options is the list of options passed to http_open. The predicate is called as if using ignore/1.
See also
- complements http:update_cookies/3.
- library(http/http_cookie) implements cookie handling on top of these hooks.
 http:update_cookies(+CookieData, +Parts, +Options) is semidet
Update the cookie database. CookieData is the value of the Set-Cookie field, Parts is the broken-down request (see uri_components/2) and Options is the list of options passed to http_open.
See also
- complements http:write_cookies
- library(http/http_cookies) implements cookie handling on top of these hooks.