May 19, 2013

27 criteria to choose analytic tools

What criteria do you use for selection, and how do you weight the various criteria? Do you think one or two products, from different companies or from the same company, will suffice? Do you look at benchmarking studies?

A few potential criteria. The starred ones are those that are important to me:
  1. Is it a new product / company, or well established?
  2. Open source? Or free trial available?
  3. Price (*)
  4. Can work with big data (*)
  5. Compatible with other products
  6. Ease of use, GUI offered (*)
  7. Can work in batch / programmable mode (*)
  8. Offers an API
  9. Capability to fetch data from the Internet, or from a database (SQL supported)
  10. Nice graphic capabilities (to visualize decision trees, for instance) (*)
  11. Speed of computations, efficiency in the way memory is used (*)
  12. Good technical support, training/documentation available (*)
  13. Local company 
  14. Offers a comprehensive set of modern procedures
  15. Offers add-ons or modules at an extra charge, as your needs grow (*)
  16. Platform-independent (*)
  17. Technique-specialized (e.g. time series, spatial data) vs. generalist
  18. Field-specialized (e.g. web log data, health care data)
  19. Compatible with your client (e.g. SAS because your clients use SAS) (*)
  20. Product used by practitioners in same field (quant, econometrics, health care, operations research, etc.)
  21. Handles missing data; offers data cleaning and auditing functionality, cross-validation
  22. Supports programming languages such as C++, Python, Java, Perl rather than an internal, ad-hoc language (despite the fact that an ad-hoc language might be more efficient) (*)
  23. Learning curve (*)
  24. Easy to upgrade (*)
  25. Possibility to work in the cloud, or use MapReduce and NoSQL features
  26. Real time features (can be integrated in a real time system, such as auction bidding) 
  27. Other criteria?
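
One simple way to weight criteria like these is a weighted scoring matrix: give each criterion a weight, rate each candidate product on each criterion, and compare the weighted averages. The sketch below is only an illustration. The criterion keys, the weights, the 1-5 ratings and the names "Tool A" and "Tool B" are all made up, not measurements of any real product.

```python
# Hypothetical weights (0 = ignore, 3 = critical) for a few of the criteria above.
weights = {
    "price": 3,
    "big_data": 3,
    "ease_of_use": 2,
    "api": 1,
}

# Hypothetical 1-5 ratings for two made-up candidate tools.
ratings = {
    "Tool A": {"price": 4, "big_data": 2, "ease_of_use": 5, "api": 3},
    "Tool B": {"price": 2, "big_data": 5, "ease_of_use": 4, "api": 4},
}


def weighted_score(tool_ratings, weights):
    """Weighted average of a tool's ratings; a missing rating counts as 0."""
    total_weight = sum(weights.values())
    weighted_sum = sum(w * tool_ratings.get(c, 0) for c, w in weights.items())
    return weighted_sum / total_weight


def rank_tools(ratings, weights):
    """Return (tool name, score) pairs sorted from best to worst score."""
    scored = [(name, weighted_score(r, weights)) for name, r in ratings.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_tools(ratings, weights):
        print(f"{name}: {score:.2f}")
```

Changing the weights to match your own starred criteria will change the ranking, which is the point: the exercise makes your priorities explicit before you compare products.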
In some cases, three products might be needed: one for visualization, one with a large number of functions (neural networks, decision trees, constrained logistic regression, time series, simplex) and one that will become your "workhorse" to produce the bulk of heavy-duty analytics.
What are your top criteria? Related questions:
  • When do you decide to upgrade or purchase an additional module to an existing product - e.g. SAS Graph or SAS Access on top of SAS Base? 
  • Who is responsible, in your organisation, for making the final decision on the choice of analytic software? End users? Statisticians? Business people? CTO?
