Parsing URLs in Zend Framework 2

Objective:

  1. Have a method that splits a URL string into its components. The string may contain either an absolute or a relative URL, and both must be parsed correctly.
  2. In addition, handle "wrong" format absolute links, that is, links without "http://". Below, such links are called partial absolute links.
  3. Support domains in the "RF" (.рф) zone.


          absolute                  relative     partial absolute
Example   http://site.ru/page.php   /page.php    site.ru/page.php
scheme    http                      -            -
host      site.ru                   -            site.ru
path      page.php                  page.php     page.php

Implementing items 1 and 2 of the objective


The Zend\Uri\Http class helps here. It has the methods we need: parse($uri), getHost(), getPath(), and so on.
But when we parse a URL such as "site.ru/page.php" (without "http://"), getHost() returns an empty value and getPath() returns "site.ru/page.php".
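
The same effect can be seen without ZF2. The following is a minimal sketch that uses PHP's built-in parse_url() as a stand-in for Zend\Uri\Http (it is not the Zend class, and the variable names are only illustrative). It shows that a string with no scheme is treated as a path, and that a leading "//" makes the first segment parse as the host:

```php
<?php
// Stand-in demonstration with PHP's built-in parse_url(), not Zend\Uri\Http.

// No scheme and no leading "//": the whole string is treated as a path.
$partial = parse_url('site.ru/page.php');
var_dump(isset($partial['host']));  // bool(false)
echo $partial['path'], "\n";        // site.ru/page.php

// With a leading "//" the first segment is parsed as the host.
$fixed = parse_url('//site.ru/page.php');
echo $fixed['host'], "\n";          // site.ru
echo $fixed['path'], "\n";          // /page.php
```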

Here is my method for getting the desired result. The format of a partial absolute link is identical to that of a link relative to the current document (one kind of relative reference). To recognize a partial absolute link, test its TLD (top-level domain): if such a domain exists, the link can be treated as partial absolute.

    // Assumes these imports at the top of the class file:
    //   use Application\Other\Http;   // the extended Http class from this article
    //   use Zend\Validator\Hostname;
    public function myParse($url)
    {
        $Http = new Http($url);
        // if the URL was parsed as a valid relative reference
        if ($Http->isValidRelative()) {
            $path = (string) $Http->getPath();
            // a path that starts with "/" cannot be a "wrong" format absolute link;
            // it is a genuine relative reference
            if ($path !== '' && $path[0] !== '/') {
                // otherwise try to build an absolute reference...
                $absoluteUrl = '//' . urldecode($Http->toString());
                $absoluteHttp = new Http($absoluteUrl);
                // (1)
                $Hostname = new Hostname(array('allow' => Hostname::ALLOW_DNS, 'useTldCheck' => false));
                $decode = true;
                // ... and check that its host is valid (2)
                if ($Hostname->isValid($absoluteHttp->getHost($decode))) {
                    // the host is valid, so treat the link as a "wrong" format absolute link
                    $Http = $absoluteHttp;
                }
            }
        }
        return $Http;
    }

Code comments

  1. Configure Zend\Validator\Hostname so that it checks whether the link's top-level domain is present in the $validIdns array.
  2. Pass $decode = true to getHost() to get the decoded host. The getHost() method of Zend\Uri\Http takes no parameters and does not decode anything. What this parameter is and how it works is explained below.
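
To try the heuristic from myParse() without ZF2, here is a rough standalone sketch. It is not the author's code: parse_url() replaces Zend\Uri\Http, and a tiny hard-coded TLD whitelist replaces Zend\Validator\Hostname. The function name parseLoose() and the $knownTlds list are hypothetical and exist only for illustration.

```php
<?php
// Hypothetical standalone sketch of the same heuristic, using parse_url()
// and a tiny TLD whitelist instead of Zend\Uri\Http and Zend\Validator\Hostname.

/**
 * Returns parse_url() components, treating "host/path" strings whose first
 * segment ends in a known TLD as partial absolute links.
 */
function parseLoose(string $url, array $knownTlds = ['ru', 'com', 'org']): array
{
    $parts = parse_url($url);
    if ($parts === false) {
        return [];
    }
    $path = $parts['path'] ?? '';
    // Already absolute (has a host), or clearly relative (starts with "/").
    if (isset($parts['host']) || $path === '' || $path[0] === '/') {
        return $parts;
    }
    // Retry with "//" so that the first segment is parsed as a host.
    $candidate = parse_url('//' . $url);
    if ($candidate === false || !isset($candidate['host'])) {
        return $parts;
    }
    $labels = explode('.', strtolower($candidate['host']));
    $tld = end($labels);
    // Accept the candidate only if the host has a dot and a known TLD.
    if (count($labels) > 1 && in_array($tld, $knownTlds, true)) {
        return $candidate;
    }
    return $parts;
}

foreach (['http://site.ru/page.php', '/page.php', 'site.ru/page.php', 'page.php'] as $url) {
    $p = parseLoose($url);
    printf("%-24s host=%s path=%s\n", $url, $p['host'] ?? '-', $p['path'] ?? '-');
}
```

With this whitelist, "site.ru/page.php" gets the host "site.ru", while "page.php" stays a relative path because "php" is not a listed TLD. This is the same reason the article relies on a TLD check to tell the two cases apart.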

Implementing item 3 of the objective: "RF" (.рф) IDNs and working with them


Unfortunately, ZF2 does not handle IDNs properly, so we have to compensate. To do that, take any class that encodes and decodes host names with Punycode, and extend the Zend\Uri\Http class.


    namespace Application\Other;

    use Zend\Uri\Http as ZendHttp;
    use Application\Model\IdnaConvert;

    class Http extends ZendHttp
    {
        // Store the host in its Punycode (ASCII) form.
        public function setHost($host)
        {
            if ($host) {
                $idn = new IdnaConvert();
                $host = $idn->encode($host);
            }
            return parent::setHost($host);
        }

        // Return the stored Punycode host, or its decoded form when $decode is true.
        public function getHost($decode = false)
        {
            if ($decode && $this->host) {
                $idn = new IdnaConvert();
                return $idn->decode($this->host);
            }
            return parent::getHost();
        }
    }

Accordingly, our myParse() method must use the extended Http class. When parsing a URL it encodes RF domains, and getHost($decode) returns either the Punycode representation or the decoded representation, depending on the parameter passed to it.
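
For readers who want to see the encode/decode round trip on its own, here is a small sketch. It assumes the PHP intl extension is installed and uses its idn_to_ascii() and idn_to_utf8() functions in place of the IdnaConvert class from the article. The host name is just an illustrative value.

```php
<?php
// Assumption: the intl extension is installed. idn_to_ascii()/idn_to_utf8()
// stand in here for the IdnaConvert class used in the article.
if (function_exists('idn_to_ascii')) {
    $unicodeHost = 'сайт.рф'; // illustrative host name

    // Unicode -> Punycode (ASCII); the ".рф" zone becomes ".xn--p1ai".
    $asciiHost = idn_to_ascii($unicodeHost, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
    echo $asciiHost, "\n";

    // Punycode -> Unicode again.
    $decodedHost = idn_to_utf8($asciiHost, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
    echo $decodedHost, "\n";
}
```

The extended setHost() and getHost($decode) above play the same two roles: the first stores the ASCII form, the second optionally converts it back.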

P.S. I have some doubts about the approach above, and that is one of the reasons for publishing this post: to hear the opinion of those with more ZF2 experience. The other reason is that I could not find a solution to this seemingly obvious task anywhere. Perhaps you can point me to other, simpler and more reliable options.
Article based on information from habrahabr.ru
