Uname: Linux premium264.web-hosting.com 4.18.0-553.lve.el8.x86_64 #1 SMP Mon May 27 15:27:34 UTC 2024 x86_64
Software: LiteSpeed
PHP version: 8.3.22 | PHP OS: Linux
Server Ip: 69.57.162.13
Your Ip: 216.73.216.219
User: workvvfb (1129) | Group: workvvfb (1084)
Safe Mode: OFF
Disabled Functions: NONE

Name: collector.cpython-38.pyc
[Recovered source: the binary dump is compiled CPython 3.8 bytecode for pip's
pip/_internal/collector.py. The module is reconstructed below from the
docstrings and symbol names visible in the dump.]

"""
The main purpose of this module is to expose LinkCollector.collect_links().
"""

import cgi
import itertools
import logging
import mimetypes
import os
from collections import OrderedDict

from pip._vendor import html5lib, requests
from pip._vendor.distlib.compat import unescape
from pip._vendor.requests.exceptions import HTTPError, RetryError, SSLError
from pip._vendor.six.moves.urllib import parse as urllib_parse
from pip._vendor.six.moves.urllib import request as urllib_request

from pip._internal.models.link import Link
from pip._internal.utils.filetypes import ARCHIVE_EXTENSIONS
from pip._internal.utils.misc import redact_auth_from_url
from pip._internal.utils.typing import MYPY_CHECK_RUNNING
from pip._internal.utils.urls import path_to_url, url_to_path
from pip._internal.vcs import is_url, vcs

if MYPY_CHECK_RUNNING:
    from typing import (
        Callable, Dict, Iterable, List, MutableMapping, Optional,
        Sequence, Tuple, Union,
    )
    import xml.etree.ElementTree

    from pip._vendor.requests import Response

    from pip._internal.models.search_scope import SearchScope
    from pip._internal.network.session import PipSession

    HTMLElement = xml.etree.ElementTree.Element
    ResponseHeaders = MutableMapping[str, str]

logger = logging.getLogger(__name__)


def _match_vcs_scheme(url):
    """Look for VCS schemes in the URL.

    Returns the matched VCS scheme, or None if there's no match.
    """
    for scheme in vcs.schemes:
        if url.lower().startswith(scheme) and url[len(scheme)] in '+:':
            return scheme
    return None


def _is_url_like_archive(url):
    """Return whether the URL looks like an archive.
    """
    filename = Link(url).filename
    for bad_ext in ARCHIVE_EXTENSIONS:
        if filename.endswith(bad_ext):
            return True
    return False


class _NotHTML(Exception):
    def __init__(self, content_type, request_desc):
        super(_NotHTML, self).__init__(content_type, request_desc)
        self.content_type = content_type
        self.request_desc = request_desc


def _ensure_html_header(response):
    """Check the Content-Type header to ensure the response contains HTML.

    Raises `_NotHTML` if the content type is not text/html.
    """
    content_type = response.headers.get("Content-Type", "")
    if not content_type.lower().startswith("text/html"):
        raise _NotHTML(content_type, response.request.method)


class _NotHTTP(Exception):
    pass


def _ensure_html_response(url, session):
    """Send a HEAD request to the URL, and ensure the response contains HTML.

    Raises `_NotHTTP` if the URL is not available for a HEAD request, or
    `_NotHTML` if the content type is not text/html.
    """
    scheme, netloc, path, query, fragment = urllib_parse.urlsplit(url)
    if scheme not in {'http', 'https'}:
        raise _NotHTTP()

    resp = session.head(url, allow_redirects=True)
    resp.raise_for_status()

    _ensure_html_header(resp)


def _get_html_response(url, session):
    """Access an HTML page with GET, and return the response.

    This consists of three parts:

    1. If the URL looks suspiciously like an archive, send a HEAD first to
       check the Content-Type is HTML, to avoid downloading a large file.
       Raise `_NotHTTP` if the content type cannot be determined, or
       `_NotHTML` if it is not HTML.
    2. Actually perform the request. Raise HTTP exceptions on network failures.
    3. Check the Content-Type header to make sure we got HTML, and raise
       `_NotHTML` otherwise.
    """
    if _is_url_like_archive(url):
        _ensure_html_response(url, session=session)

    logger.debug('Getting page %s', redact_auth_from_url(url))

    resp = session.get(
        url,
        headers={
            "Accept": "text/html",
            "Cache-Control": "max-age=0",
        },
    )
    resp.raise_for_status()

    _ensure_html_header(resp)

    return resp


def _get_encoding_from_headers(headers):
    """Determine if we have any encoding information in our headers.
    """
    if headers and "Content-Type" in headers:
        content_type, params = cgi.parse_header(headers["Content-Type"])
        if "charset" in params:
            return params['charset']
    return None


def _determine_base_url(document, page_url):
    """Determine the HTML document's base URL.

    This looks for a ``<base>`` tag in the HTML document. If present, its href
    attribute denotes the base URL of anchor tags in the document. If there is
    no such tag (or if it does not have a valid href attribute), the HTML
    file's URL is used as the base URL.

    :param document: An HTML document representation. The current
        implementation expects the result of ``html5lib.parse()``.
    :param page_url: The URL of the HTML document.
    """
    for base in document.findall(".//base"):
        href = base.get("href")
        if href is not None:
            return href
    return page_url


def _clean_link(url):
    """Makes sure a link is fully encoded.  That is, if a ' ' shows up in
    the link, it will be rewritten to %20 (while not over-quoting
    % or other characters)."""
    result = urllib_parse.urlparse(url)
    # An empty netloc means the URL refers to a local filesystem path;
    # round-trip it through url2pathname/pathname2url so platform details
    # such as drive-letter colons are handled correctly.
    if result.netloc == '':
        path = urllib_request.pathname2url(
            urllib_request.url2pathname(result.path))
    else:
        # Protect `@` in addition to `/` so revision strings in VCS URLs
        # survive the quoting.
        path = urllib_parse.quote(urllib_parse.unquote(result.path), safe='/@')
    return urllib_parse.urlunparse(result._replace(path=path))


def _create_link_from_element(anchor, page_url, base_url):
    """
    Convert an anchor element in a simple repository page to a Link.
    """
    href = anchor.get("href")
    if not href:
        return None

    url = _clean_link(urllib_parse.urljoin(base_url, href))
    pyrequire = anchor.get('data-requires-python')
    pyrequire = unescape(pyrequire) if pyrequire else None

    yanked_reason = anchor.get('data-yanked')
    if yanked_reason:
        yanked_reason = unescape(yanked_reason)

    link = Link(
        url,
        comes_from=page_url,
        requires_python=pyrequire,
        yanked_reason=yanked_reason,
    )
    return link


def parse_links(page):
    """
    Parse an HTML document, and yield its anchor elements as Link objects.
    """
    document = html5lib.parse(
        page.content,
        transport_encoding=page.encoding,
        namespaceHTMLElements=False,
    )

    url = page.url
    base_url = _determine_base_url(document, url)
    for anchor in document.findall(".//a"):
        link = _create_link_from_element(
            anchor,
            page_url=url,
            base_url=base_url,
        )
        if link is None:
            continue
        yield link


class HTMLPage(object):
    """Represents one page, along with its URL"""

    def __init__(self, content, encoding, url):
        """
        :param encoding: the encoding to decode the given content.
        :param url: the URL from which the HTML was downloaded.
        """
        self.content = content
        self.encoding = encoding
        self.url = url

    def __str__(self):
        return redact_auth_from_url(self.url)


def _handle_get_page_fail(link, reason, meth=None):
    if meth is None:
        meth = logger.debug
    meth("Could not fetch URL %s: %s - skipping", link, reason)


def _make_html_page(response):
    encoding = _get_encoding_from_headers(response.headers)
    return HTMLPage(response.content, encoding=encoding, url=response.url)


def _get_html_page(link, session=None):
    if session is None:
        raise TypeError(
            "_get_html_page() missing 1 required keyword argument: 'session'"
        )

    url = link.url.split('#', 1)[0]

    # Check for VCS schemes that do not support lookup as web pages.
    vcs_scheme = _match_vcs_scheme(url)
    if vcs_scheme:
        logger.debug('Cannot look at %s URL %s', vcs_scheme, link)
        return None

    # Tack index.html onto file:// URLs that point to directories.
    scheme, _, path, _, _, _ = urllib_parse.urlparse(url)
    if scheme == 'file' and os.path.isdir(urllib_request.url2pathname(path)):
        # Add a trailing slash if not present so urljoin doesn't trim the
        # final segment.
        if not url.endswith('/'):
            url += '/'
        url = urllib_parse.urljoin(url, 'index.html')
        logger.debug(' file: URL is directory, getting %s', url)

    try:
        resp = _get_html_response(url, session=session)
    except _NotHTTP:
        logger.debug(
            'Skipping page %s because it looks like an archive, and cannot '
            'be checked by HEAD.', link,
        )
    except _NotHTML as exc:
        logger.debug(
            'Skipping page %s because the %s request got Content-Type: %s',
            link, exc.request_desc, exc.content_type,
        )
    except HTTPError as exc:
        _handle_get_page_fail(link, exc)
    except RetryError as exc:
        _handle_get_page_fail(link, exc)
    except SSLError as exc:
        reason = "There was a problem confirming the ssl certificate: "
        reason += str(exc)
        _handle_get_page_fail(link, reason, meth=logger.info)
    except requests.ConnectionError as exc:
        _handle_get_page_fail(link, "connection error: %s" % exc)
    except requests.Timeout:
        _handle_get_page_fail(link, "timed out")
    else:
        return _make_html_page(resp)
    return None


def _remove_duplicate_links(links):
    """
    Return a list of links, with duplicates removed and ordering preserved.
    """
    # OrderedDict.fromkeys() keeps the first occurrence of each link.
    return list(OrderedDict.fromkeys(links))


def group_locations(locations, expand_dir=False):
    """
    Divide a list of locations into two groups: "files" (archives) and "urls."

    :return: A pair of lists (files, urls).
    """
    files = []
    urls = []

    # Puts the url for the given file path into the appropriate list.
    def sort_path(path):
        url = path_to_url(path)
        if mimetypes.guess_type(url, strict=False)[0] == 'text/html':
            urls.append(url)
        else:
            files.append(url)

    for url in locations:
        is_local_path = os.path.exists(url)
        is_file_url = url.startswith('file:')

        if is_local_path or is_file_url:
            if is_local_path:
                path = url
            else:
                path = url_to_path(url)
            if os.path.isdir(path):
                if expand_dir:
                    path = os.path.realpath(path)
                    for item in os.listdir(path):
                        sort_path(os.path.join(path, item))
                elif is_file_url:
                    urls.append(url)
                else:
                    logger.warning(
                        "Path '{0}' is ignored: "
                        "it is a directory.".format(path),
                    )
            elif os.path.isfile(path):
                sort_path(path)
            else:
                logger.warning(
                    "Url '%s' is ignored: it is neither a file "
                    "nor a directory.", url,
                )
        elif is_url(url):
            # Only add url with clear scheme.
            urls.append(url)
        else:
            logger.warning(
                "Url '%s' is ignored. It is either a non-existing "
                "path or lacks a specific scheme.", url,
            )

    return files, urls


class CollectedLinks(object):
    """
    Encapsulates all the Link objects collected by a call to
    LinkCollector.collect_links(), stored separately as--

    (1) links from the configured file locations,
    (2) links from the configured find_links, and
    (3) a dict mapping HTML page url to links from that page.
    """

    def __init__(self, files, find_links, pages):
        """
        :param files: Links from file locations.
        :param find_links: Links from find_links.
        :param pages: A dict mapping HTML page url to links from that page.
        """
        self.files = files
        self.find_links = find_links
        self.pages = pages


class LinkCollector(object):
    """
    Responsible for collecting Link objects from all configured locations,
    making network requests as needed.

    The class's main method is its collect_links() method.
    """

    def __init__(self, session, search_scope):
        self.search_scope = search_scope
        self.session = session

    @property
    def find_links(self):
        return self.search_scope.find_links

    def _get_pages(self, locations):
        """
        Yields (page, page_url) from the given locations, skipping
        locations that have errors.
        """
        for location in locations:
            page = _get_html_page(location, session=self.session)
            if page is None:
                continue
            yield page

    def collect_links(self, project_name):
        """Find all available links for the given project name.

        :return: All the Link objects (unfiltered), as a CollectedLinks object.
        """
        search_scope = self.search_scope
        index_locations = search_scope.get_index_urls_locations(project_name)
        index_file_loc, index_url_loc = group_locations(index_locations)
        fl_file_loc, fl_url_loc = group_locations(
            self.find_links, expand_dir=True,
        )

        file_links = [
            Link(url) for url in itertools.chain(index_file_loc, fl_file_loc)
        ]

        # We trust every directly linked archive in find_links.
        find_link_links = [Link(url, '-f') for url in self.find_links]

        # We trust every url that the user has given us whether it was given
        # via --index-url or --find-links.
        url_locations = [
            link for link in itertools.chain(
                (Link(url) for url in index_url_loc),
                (Link(url) for url in fl_url_loc),
            )
            if self.session.is_secure_origin(link)
        ]

        url_locations = _remove_duplicate_links(url_locations)
        lines = [
            '{} location(s) to search for versions of {}:'.format(
                len(url_locations), project_name,
            ),
        ]
        for link in url_locations:
            lines.append('* {}'.format(link))
        logger.debug('\n'.join(lines))

        pages_links = {}
        for page in self._get_pages(url_locations):
            pages_links[page.url] = list(parse_links(page))

        return CollectedLinks(
            files=file_links,
            find_links=find_link_links,
            pages=pages_links,
        )
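Usage sketch (an editor's illustration, not part of the recovered module): a
minimal example of how the pieces above fit together. It assumes this pip
19.3-era layout and pip's private PipSession and SearchScope.create helpers,
which are internal APIs and can change between pip versions.

    from pip._internal.models.search_scope import SearchScope
    from pip._internal.network.session import PipSession

    # LinkCollector is the class reconstructed above
    # (pip._internal.collector in this pip vintage).
    session = PipSession()
    search_scope = SearchScope.create(
        find_links=[],                           # e.g. local wheel directories
        index_urls=["https://pypi.org/simple"],  # PEP 503 simple indexes
    )
    collector = LinkCollector(session, search_scope)

    collected = collector.collect_links("requests")
    for page_url, links in collected.pages.items():
        print(page_url, len(links))

collect_links() returns the unfiltered set of candidate links; filtering by
version and platform compatibility happens elsewhere in pip.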