public class RobotsTXT extends Page
This class analyses /robots.txt files and provides the respective test function. The format was summarized and suggested by M. Koster.

The format and semantics of the "/robots.txt" file are as follows:

The file consists of one or more records separated by one or more blank lines (terminated by CR, CR/NL, or NL). Each record contains lines of the form "<field>: <optionalspace><value><optionalspace>". The field name is case insensitive.

Comments can be included in the file using UNIX Bourne shell conventions: the '#' character indicates that the preceding space (if any) and the remainder of the line up to the line termination are discarded. Lines containing only a comment are discarded completely, and therefore do not indicate a record boundary.

A record starts with one or more User-agent lines, followed by one or more Disallow lines. Unrecognised headers are ignored.
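The format rules above can be illustrated with a small, self-contained sketch. It is not the Egothor implementation: the class RobotsFormatSketch, its nested RuleRecord type, and the methods parse, parseText and allowed are hypothetical names introduced only for this example. The prefix test in allowed assumes the convention of the original robots.txt description, where a path is excluded if it starts with a non-empty Disallow value; whether RobotsTXT.valid behaves the same way is not stated on this page.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class RobotsFormatSketch {

    /** One record: the user agents it names and the path prefixes it disallows. */
    public static final class RuleRecord {
        public final List<String> userAgents = new ArrayList<>();
        public final List<String> disallow = new ArrayList<>();
    }

    /** Splits the input into records according to the rules quoted above. */
    public static List<RuleRecord> parse(BufferedReader in) throws IOException {
        List<RuleRecord> records = new ArrayList<>();
        RuleRecord current = null;
        String line;
        // readLine() accepts CR, CR/NL and NL as line terminators.
        while ((line = in.readLine()) != null) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                // Drop the comment together with any space before it.
                line = line.substring(0, hash).trim();
                if (line.isEmpty()) {
                    continue; // comment-only line: not a record boundary
                }
            }
            line = line.trim();
            if (line.isEmpty()) {
                current = null; // blank line closes the current record
                continue;
            }
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // not of the form "<field>: <value>"
            }
            // Field names are case insensitive.
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (!field.equals("user-agent") && !field.equals("disallow")) {
                continue; // unrecognised headers are ignored
            }
            if (current == null) {
                current = new RuleRecord();
                records.add(current);
            }
            if (field.equals("user-agent")) {
                current.userAgents.add(value);
            } else {
                current.disallow.add(value);
            }
        }
        return records;
    }

    /** Convenience wrapper for in-memory text. */
    public static List<RuleRecord> parseText(String text) {
        try {
            return parse(new BufferedReader(new StringReader(text)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Assumed prefix test: allowed unless the path starts with a non-empty Disallow value. */
    public static boolean allowed(RuleRecord record, String path) {
        for (String prefix : record.disallow) {
            if (!prefix.isEmpty() && path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String text = "# sample file\n"
                + "User-agent: *   # all robots\n"
                + "# a comment inside a record\n"
                + "Disallow: /private/\n"
                + "\n"
                + "User-Agent: examplebot\n"
                + "Disallow:\n";
        List<RuleRecord> records = parseText(text);
        System.out.println(records.size());                                // 2
        System.out.println(allowed(records.get(0), "/private/a.html"));   // false
        System.out.println(allowed(records.get(1), "/private/a.html"));   // true
    }
}
```

The sketch keeps records as plain lists so that the comment, blank-line and case rules stay easy to follow; choosing the record that applies to a particular robot is deliberately left out.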
Modifier and Type | Field and Description |
---|---|
long | nextConn |
Modifier and Type | Method and Description |
---|---|
static RobotsTXT | EMPTY() |
boolean | equalTo(RobotsTXT r) |
static RobotsTXT | FULL() |
int | hashCode() |
void | initialize(long time, String name, int rank) Sets this object as fresh and new, with lastModif=0 and nextConn=time. |
void | load(DataInput in) |
static RobotsTXT | parse(BufferedReader r) Analyses a robots.txt file. |
void | print() |
void | store(DataOutput out) Stores this structure to a stream. |
String | toString() |
boolean | valid(String uri) Tests whether the given URL fulfils the rules. |
Methods inherited from class Page: dropped, isAlreadyProcessed, toStringBuilder
public RobotsTXT(DataInput in) throws IOException
Throws: IOException
public static RobotsTXT EMPTY()
public static RobotsTXT FULL()
public void print()
public boolean equalTo(RobotsTXT r)
public static RobotsTXT parse(BufferedReader r)
Parameters: r - The file
public void store(DataOutput out) throws IOException
store in class Page
Parameters: out - The stream
Throws: IOException - When I/O fails
public void load(DataInput in) throws IOException
load in class Page
Throws: IOException
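The store and load pair suggests a binary round trip over DataOutput and DataInput. The following is only a sketch of that pattern: the class StoreLoadSketch, its field layout, and the roundTrip helper are assumptions made for illustration (the names lastModif and nextConn echo the values mentioned for initialize), and the actual on-stream layout used by RobotsTXT is not documented on this page.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class StoreLoadSketch {
    // Hypothetical state; the real class may keep different fields.
    public long lastModif;
    public long nextConn;
    public String[] disallow = new String[0];

    /** Writes the fields in a fixed order. */
    public void store(DataOutput out) throws IOException {
        out.writeLong(lastModif);
        out.writeLong(nextConn);
        out.writeInt(disallow.length);
        for (String prefix : disallow) {
            out.writeUTF(prefix);
        }
    }

    /** Reads the fields back in exactly the order store wrote them. */
    public void load(DataInput in) throws IOException {
        lastModif = in.readLong();
        nextConn = in.readLong();
        int count = in.readInt();
        disallow = new String[count];
        for (int i = 0; i < count; i++) {
            disallow[i] = in.readUTF();
        }
    }

    /** Stores the object into a byte array and loads a fresh copy from it. */
    public static StoreLoadSketch roundTrip(StoreLoadSketch source) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                source.store(out);
            }
            StoreLoadSketch copy = new StoreLoadSketch();
            try (DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                copy.load(in);
            }
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StoreLoadSketch original = new StoreLoadSketch();
        original.nextConn = 42L;
        original.disallow = new String[] {"/private/"};
        StoreLoadSketch copy = roundTrip(original);
        System.out.println(copy.nextConn);     // 42
        System.out.println(copy.disallow[0]);  // /private/
    }
}
```

Because DataInput carries no field names, load has to mirror the write order of store exactly; changing one method without the other would silently corrupt the data that is read back.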
public boolean valid(String uri)
Parameters: uri - The URL
public void initialize(long time, String name, int rank)
Copyright © 2016 Egothor. All Rights Reserved.