How to read the contents of a web page in Java without using any external library?

September 2, 2023

The URL class of the java.net package represents a Uniform Resource Locator, which points to a resource (a file, a directory, or a reference) on the World Wide Web.

The openStream() method of this class opens a connection to the URL represented by the current object and returns an InputStream object from which you can read the data at that URL.
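As a rough illustration of what openStream() hands back, the sketch below reads the returned InputStream line by line through a BufferedReader. The class name OpenStreamSketch and the helpers readAll and fetch are hypothetical names introduced here, UTF-8 is an assumed encoding, and URI.create(...).toURL() is used because the URL(String) constructor is deprecated in recent JDKs. So that it can run without network access, it falls back to a temporary local file addressed through a file: URL when no address is passed on the command line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class OpenStreamSketch {

    // Reads every line from the stream and closes it afterwards.
    // Assumption: the bytes are UTF-8 encoded text.
    static String readAll(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    // Opens a connection to the URL and returns its content as text.
    static String fetch(URL url) throws IOException {
        return readAll(url.openStream());
    }

    public static void main(String[] args) throws IOException {
        URL url;
        if (args.length > 0) {
            // An address supplied by the caller, e.g. an http:// or https:// page
            url = URI.create(args[0]).toURL();
        } else {
            // Offline fallback (hypothetical content): a temp file behind a file: URL
            Path tmp = Files.createTempFile("page", ".html");
            Files.write(tmp, "<html><body><h1>It works!</h1></body></html>"
                    .getBytes(StandardCharsets.UTF_8));
            tmp.toFile().deleteOnExit();
            url = tmp.toUri().toURL();
        }
        System.out.print(fetch(url));
    }
}
```

Passing a page address as the first argument fetches that page instead of the temporary file. Reading through a Reader with an explicit charset keeps line breaks and spaces intact, which matters if the text is processed further.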

Therefore, to read data from a web page using the URL class:

  • Instantiate the java.net.URL class by passing the URL of the desired web page as a parameter to its constructor.

  • Invoke the openStream() method and retrieve the InputStream object.

  • Instantiate the Scanner class by passing the retrieved InputStream object as a parameter.

Example

import java.io.IOException;
import java.net.URL;
import java.util.Scanner;

public class ReadingWebPage {
    public static void main(String[] args) throws IOException {
        // Instantiating the URL class
        URL url = new URL("http://www.something.com/");
        // Retrieving the contents of the specified page
        Scanner sc = new Scanner(url.openStream());
        // Instantiating the StringBuffer class to hold the result
        StringBuffer sb = new StringBuffer();
        // next() returns whitespace-delimited tokens, so whitespace between words is dropped
        while (sc.hasNext()) {
            sb.append(sc.next());
        }
        // Closing the Scanner also closes the underlying stream
        sc.close();
        // Retrieving the String from the StringBuffer object
        String result = sb.toString();
        System.out.println(result);
        // Removing the HTML tags
        result = result.replaceAll("<[^>]*>", "");
        System.out.println("Contents of the web page: " + result);
    }
}

Output

Itworks!
Contents of the web page: Itworks!
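
The text comes out as "Itworks!" with no space because Scanner.next() returns whitespace-delimited tokens and the loop joins them directly. If the original spacing matters, one possible variant is to set the Scanner delimiter to "\\A" (start of input) so the whole stream is returned as a single token. The sketch below is only an illustration under stated assumptions: the class name PreserveWhitespace and the helpers readWhole and stripTags are hypothetical, the content is assumed to be UTF-8, and an in-memory stream stands in for url.openStream() so that it runs offline. The pattern "<[^>]*>" is a naive tag matcher, not an HTML parser.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class PreserveWhitespace {

    // Reads the whole stream as one token. "\\A" matches only at the start
    // of the input, so no further delimiter is found and next() returns
    // everything, whitespace included. Assumption: UTF-8 content.
    static String readWhole(InputStream in) {
        try (Scanner sc = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            return sc.hasNext() ? sc.next() : "";
        }
    }

    // Removes anything that looks like a tag: '<', any run of non-'>' characters, '>'.
    static String stripTags(String html) {
        return html.replaceAll("<[^>]*>", "");
    }

    public static void main(String[] args) {
        // Stand-in for url.openStream(): a hypothetical page held in memory
        byte[] page = "<html><body><h1>It works!</h1></body></html>"
                .getBytes(StandardCharsets.UTF_8);
        String html = readWhole(new ByteArrayInputStream(page));
        System.out.println(stripTags(html)); // prints: It works!
    }
}
```

With a real page, readWhole(url.openStream()) would take the place of the in-memory stream. A regex of this kind does not handle scripts, comments, or a '>' inside an attribute value, so it is suitable only for simple markup.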
