
Nutch 1.4, Solr 3.4 configuration error

I am trying to configure Nutch 1.4 with Solr 3.4.

I configured everything and when I run the command:

./nutch crawl urls -dir myCrawl2 -solr http://localhost:8080 -depth 2 -topN

I get the following error:

java.io.IOException: Job failed!
SolrDeleteDuplicates: starting at 2013-06-06 15:49:30
SolrDeleteDuplicates: Solr url: http://localhost:8080
Exception in thread "main" java.io.IOException: org.apache.solr.client.solrj.SolrServerException: Error executing query
at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat.getSplits(SolrDeleteDuplicates.java:200)
at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:373)
at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:353)
at org.apache.nutch.crawl.Crawl.run(Crawl.java:153)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.crawl.Crawl.main(Crawl.java:55)
Caused by: org.apache.solr.client.solrj.SolrServerException: Error executing query
at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:118)
at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat.getSplits(SolrDeleteDuplicates.java:198)
... 9 more
Caused by: org.apache.solr.common.SolrException: Not Found

Not Found

request: http://localhost:8080/select?q=id:[* TO *]&fl=id&rows=1&wt=javabin&version=2
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:430)
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)
... 11 more

Other possibly helpful information:
1) The Solr admin screen comes up fine in the browser.
2) I copied the schema.xml file that came with Nutch into my Solr core conf
directory.
3) Nutch runs and crawls everything; the error is thrown only when it comes
time to post the results to Solr.
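
Since the admin screen loads but the query under http://localhost:8080 comes
back Not Found, one way to narrow this down is to send the same kind of query
by hand against a few candidate base URLs and compare the status codes. Below
is a minimal sketch, not a confirmed fix. The select_url helper is
hypothetical, the rows value is my choice to keep the response small, and the
/solr context path in the second candidate is only an assumption about where
Solr might be deployed on this machine.

```python
from urllib.parse import urlencode


def select_url(base_url: str) -> str:
    """Build a /select query URL under base_url (hypothetical helper).

    The query id:[* TO *] is the one visible in the stack trace; rows=1 is
    an illustrative choice to keep the response small.
    """
    params = {"q": "id:[* TO *]", "rows": 1}
    return base_url.rstrip("/") + "/select?" + urlencode(params)


if __name__ == "__main__":
    import urllib.error
    import urllib.request

    # The first base URL is the one passed to -solr in the command above.
    # The second is an assumed alternative, not something known to exist here.
    for base in ("http://localhost:8080", "http://localhost:8080/solr"):
        url = select_url(base)
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                print(base, "->", resp.status)
        except urllib.error.HTTPError as err:
            # A 404 here matches the "Not Found" seen in the trace.
            print(base, "->", err.code)
        except OSError as err:
            print(base, "-> unreachable:", err)
```

If one candidate answers 200 while the other answers 404, that candidate is
the base URL worth trying with the -solr option.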

I have configured everything I can think of, checked logs, and scoured the
Internet and have not been able to find a solution. If anybody has any
ideas on how I can resolve this I would be incredibly grateful.
