Distributed System
Manage distributed nodes to enable efficient execution of web crawling programs across multiple servers
Spider Management
Effectively manage web crawling programs without the need for manual uploading, monitoring, and deployment; everything is automated
Task Scheduling
Easily schedule web crawling tasks and view the real-time status and logs of running web crawling programs
Cron-style Scheduler
Schedule web crawling tasks to run automatically at specified times using the cron-style scheduler
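As a rough illustration of how a standard five-field cron expression ("minute hour day month weekday") expands into concrete run times, here is a minimal sketch using the third-party croniter library; this is not part of the platform itself, just a demonstration of the syntax.

```python
# Illustrative only: expand a cron expression into its next run times
# using the third-party croniter library.
from datetime import datetime
from croniter import croniter

schedule = "0 2 * * *"  # every day at 02:00
it = croniter(schedule, datetime(2024, 1, 1))
print(it.get_next(datetime))  # 2024-01-01 02:00:00
print(it.get_next(datetime))  # 2024-01-02 02:00:00
```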
File Editing
Edit web crawling code online with support for syntax highlighting in popular programming languages, making it easy to debug web crawling programs
Notification Alerts
Receive notifications about the status of web crawling tasks via email, DingTalk, WeChat Work, and other channels
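For a sense of what such an alert looks like under the hood, below is a minimal sketch of pushing a status message to a DingTalk group-robot webhook with the requests library; the access token and message text are placeholders, and the platform's built-in notifier does not require writing this yourself.

```python
# Minimal sketch of a DingTalk group-robot notification.
# ACCESS_TOKEN and the message content are placeholders.
import requests

ACCESS_TOKEN = "your-dingtalk-robot-token"  # placeholder
url = f"https://oapi.dingtalk.com/robot/send?access_token={ACCESS_TOKEN}"
payload = {
    "msgtype": "text",
    "text": {"content": "Spider 'news_spider' finished: 1,024 items scraped"},
}
resp = requests.post(url, json=payload, timeout=10)
resp.raise_for_status()
```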
Dependency Management
Built-in dependency management lets you install the third-party libraries required for web crawling directly from the interface
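Conceptually, an automated installer like this boils down to invoking pip for the node's interpreter against a requirements file; the sketch below is illustrative, with a hypothetical helper name and a default requirements.txt path.

```python
# Illustrative sketch of automated dependency installation:
# run pip for the current interpreter against a requirements file.
import subprocess
import sys

def install_requirements(path="requirements.txt"):  # hypothetical helper
    subprocess.run(
        [sys.executable, "-m", "pip", "install", "-r", path],
        check=True,
    )

if __name__ == "__main__":
    install_requirements()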
Git Integration
Pull web crawling project code directly from Git repositories and view historical versions
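To illustrate the underlying workflow, this sketch clones a spider repository and lists its recent commits using the third-party GitPython library; the repository URL and checkout path are placeholders.

```python
# Sketch of pulling spider code from a Git repository and
# listing recent history with GitPython.
from git import Repo

repo = Repo.clone_from("https://example.com/your/spider-repo.git", "/tmp/spider-repo")
for commit in repo.iter_commits(max_count=5):
    print(commit.hexsha[:8], commit.summary)
```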
Performance Monitoring
Real-time monitoring of performance metrics such as CPU, memory, disk, and network in the execution environment
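The platform collects these metrics automatically; purely for illustration, the same kinds of node-level readings can be gathered with the third-party psutil library, as in this sketch.

```python
# Sketch of collecting node metrics (CPU, memory, disk, network) with psutil.
import psutil

metrics = {
    "cpu_percent": psutil.cpu_percent(interval=1),
    "memory_percent": psutil.virtual_memory().percent,
    "disk_percent": psutil.disk_usage("/").percent,
    "net_bytes_sent": psutil.net_io_counters().bytes_sent,
    "net_bytes_recv": psutil.net_io_counters().bytes_recv,
}
print(metrics)
```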
Database Integration
Seamless integration with databases such as MySQL, MongoDB, Elasticsearch, and more
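As one example of what database integration means in practice, the sketch below writes a scraped item straight to MongoDB with pymongo; the connection string, database, and collection names are placeholders.

```python
# Sketch of persisting scraped items to MongoDB with pymongo.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder connection string
collection = client["crawl_results"]["articles"]   # placeholder db/collection

item = {"title": "Example article", "url": "https://example.com/a/1"}
collection.insert_one(item)
print(collection.count_documents({}))
```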
Access Control
Support user, role, and permission management to effectively control access to web crawling programs
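To make the idea concrete, here is a purely hypothetical sketch of a role-based permission check; the role names and permission strings are illustrative and do not describe the platform's actual access model.

```python
# Hypothetical role-based access check: map roles to permission sets
# and test whether a role may perform an action.
ROLE_PERMISSIONS = {
    "admin": {"spider:read", "spider:run", "spider:edit", "user:manage"},
    "operator": {"spider:read", "spider:run"},
    "viewer": {"spider:read"},
}

def can(role: str, permission: str) -> bool:
    return permission in ROLE_PERMISSIONS.get(role, set())

assert can("operator", "spider:run")
assert not can("viewer", "spider:edit")
```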
More New Features
Additional features include project management, an API, an SDK, a CLI, and a web crawling canvas, among others