Houdu Chukonu

Chukonu is a self-developed, high-performance big data engine for massive data processing workloads. On the same hardware, it computes faster over multi-modal data such as structured, text, and graph data, which lowers users' hardware investment and so reduces costs and improves efficiency.

Overview

1. A text corpus cleaning library for large model training that supports filtering, cleaning, and deduplicating text corpus data
2. A high-performance programming framework compatible with Spark SQL, PySpark, Pandas, and other programming interfaces, unifying Data + AI
3. A C++-based distributed computing programming platform that supports the development of high-performance data processing modules
4. Processing of massive graph data, with high-performance graph computing algorithms such as PageRank
5. A machine learning library for massive data sets, providing traditional machine learning algorithms such as K-Means and KR

Highlights

Corpus cleaning for large model training
Fully compatible with Spark, with speedups ranging from several times to tens of times over Apache Spark in common applications
Supports C++ high-performance module development

Details

Sold by

Houdu Technology (厚笃科技)

Pricing

Houdu Chukonu

This product is available free of charge. Free subscriptions have no end date and may be canceled at any time.

Additional Amazon Web Services infrastructure costs may apply. Use the Amazon Web Services Pricing Calculator to estimate your infrastructure costs.

Vendor refund policy

Refunds are not currently supported, but the subscription can be canceled at any time. Please contact tech@houdutech.cn.

Legal

Vendor terms and conditions

Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).

Content disclaimer

Vendors are responsible for their product descriptions and other product content. Amazon Web Services Marketplace China does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

Usage information

Delivery details

64-bit (x86) Amazon Machine Image (AMI)

Amazon Machine Image (AMI)

An AMI is a virtual image that provides the information required to launch an instance. Amazon EC2 (Elastic Compute Cloud) instances are virtual servers on which you can run your applications and workloads, offering varying combinations of CPU, memory, storage, and networking resources. You can launch as many instances from as many different AMIs as you need.

Version release notes

Text corpus cleaning library for large model training that supports filtering, cleaning, and deduplicating text corpus data
High-performance programming framework compatible with Spark SQL, PySpark, Pandas, and other programming interfaces, unifying Data + AI
C++-based distributed computing programming platform that supports the development of high-performance data processing modules
Processing of massive graph data, with high-performance graph computing algorithms such as PageRank
Machine learning library for massive data sets, providing traditional machine learning algorithms such as K-Means and KR

Additional details

Usage instructions

Start the cluster
(1) Open CloudFormation: search for CloudFormation in the AWS console and go to the CloudFormation homepage.
(2) Create a stack: select Existing Template -> Upload Existing Template. The template can be downloaded from:
https://chukonu.houdutech.cn/aws-download/cloudformation-chukonu-v1.0.yaml
(3) Set the parameters: in the stack details, set instanceCount to 3 and choose an EC2 instance type for InstanceType. m6i.2xlarge is recommended, but you can adjust this to suit your workload.
(4) Leave the other stack options at their defaults and submit.
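
The console steps above could also be scripted. The following is a minimal sketch, not the vendor's documented method: it assumes boto3 is installed, AWS credentials are configured, the template accepts the two parameter names exactly as spelled on this page, and the template is small enough to pass inline. The stack name and the function names are hypothetical.

```python
import urllib.request

# Template URL as given in the instructions above.
TEMPLATE_URL = (
    "https://chukonu.houdutech.cn/aws-download/cloudformation-chukonu-v1.0.yaml"
)


def build_stack_request(template_body, stack_name="chukonu-cluster",
                        instance_count=3, instance_type="m6i.2xlarge"):
    """Assemble keyword arguments for CloudFormation's create_stack call.

    The stack name is a hypothetical placeholder. The parameter keys are
    copied from this page and are assumed to match the template.
    """
    return {
        "StackName": stack_name,
        "TemplateBody": template_body,
        "Parameters": [
            # CloudFormation parameter values are always passed as strings.
            {"ParameterKey": "instanceCount",
             "ParameterValue": str(instance_count)},
            {"ParameterKey": "InstanceType",
             "ParameterValue": instance_type},
        ],
    }


def create_cluster(region_name):
    """Download the template and submit the stack; returns the stack ID."""
    import boto3  # imported lazily so build_stack_request works without it

    with urllib.request.urlopen(TEMPLATE_URL) as response:
        template_body = response.read().decode("utf-8")

    client = boto3.client("cloudformation", region_name=region_name)
    # If the template creates IAM resources, create_stack also needs a
    # Capabilities argument; whether this template does is not stated here.
    result = client.create_stack(**build_stack_request(template_body))
    return result["StackId"]
```

Calling `create_cluster("cn-north-1")` would submit the stack in the Beijing region; the region is only an example and should match wherever the subscription is available.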
Run the test program
(1) Download the test script and program.
Test script download address: https://chukonu.houdutech.cn/aws-download/submit-job.sh
Test program download address: https://chukonu.houdutech.cn/aws-download/word_count.py
(2) After the downloads complete, upload submit-job.sh and word_count.py to the user directory on the Chukonu master server, then run the command sh submit-job.sh to test.
Note: After subscribing, follow the steps above to use the product; you do not need to launch a server from the website or the EC2 console.
See more usage instructions:

https://chukonu.houdutech.cn/aws-download/instructions.txt
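
The contents of word_count.py are not reproduced on this page. For orientation, here is a minimal sketch of what a PySpark word count job typically looks like, on the assumption that Chukonu runs standard PySpark code as its compatibility claim suggests. The function names `tokenize` and `count_words` are illustrative and are not taken from the vendor's script.

```python
import sys
from operator import add


def tokenize(line):
    """Split a line into lowercase, whitespace-separated words."""
    return line.lower().split()


def count_words(spark, input_path):
    """Return a list of (word, count) pairs for the text file at input_path."""
    lines = spark.sparkContext.textFile(input_path)
    counts = (
        lines.flatMap(tokenize)          # one record per word
        .map(lambda word: (word, 1))     # pair each word with a count of 1
        .reduceByKey(add)                # sum the counts per word
    )
    return counts.collect()


# Runs only when an input path is given on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    from pyspark.sql import SparkSession  # needs a Spark-compatible runtime

    spark = SparkSession.builder.appName("word-count-sketch").getOrCreate()
    for word, count in count_words(spark, sys.argv[1]):
        print(word, count)
    spark.stop()
```

A script of this shape takes the input file path as its single argument. How submit-job.sh actually invokes word_count.py is described in the vendor's instructions linked above.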

Resources

Vendor resources

Official website

Support

Vendor support

Technical support contact information: tech@houdutech.cn

Amazon Web Services infrastructure support

Amazon Web Services Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.
