# In-Memory Analytics with Apache Arrow
* Exercises at https://github.com/Harry-Kwon/arrow-sandbox
# skim notes
## Section 1
### ch1 arrow memory format
- **array**: list of values with known length and same type
- **record batch**: group of equal length **arrays** and a schema
- **slot**: value in an **array** specified by index
### ch2 key arrow specs
- read from local file system, Amazon S3, HDFS, CSV
* parallelized CSV read
- pandas + arrow
- **chunked array**: wrapper around group of arrow **arrays** of same data type
* incrementally build up an array without reallocating or copying the existing data
- **table**: holds one or more **chunked arrays** and a schema.
* analogous to **record batches**
- sliced buffers for working in parallel
- FFI C headers for zero-copy sharing between languages