Spark Table Caching

rayx
June 19, 2022

In my tests, caching the table cut the query time from 1 s to 76 ms.
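A figure like this can be reproduced with a small timing wrapper. The sketch below is only an illustration: `timedMs` is a hypothetical helper name, not part of Spark, and it measures a single run, so numbers will vary between runs. In spark-shell, the built-in `spark.time { ... }` prints the elapsed time in a similar way.

```scala
// Hypothetical helper (not part of Spark): runs a block once and
// returns its result together with the elapsed wall-clock time in ms.
def timedMs[T](block: => T): (T, Double) = {
  val start = System.nanoTime()
  val result = block                                  // evaluate the by-name block exactly once
  val elapsedMs = (System.nanoTime() - start) / 1e6   // nanoseconds to milliseconds
  (result, elapsedMs)
}

// Usage in spark-shell, assuming a view named "emp" is already registered:
// val (_, ms) = timedMs(spark.sql("select * from emp").show())
// println(s"query took $ms ms")
```

Timing the same query once before and once after caching gives a rough before/after comparison.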

scala> val df1 = spark.read.json("/testdata/emp.json") // the file must first be uploaded to HDFS
df1: org.apache.spark.sql.DataFrame = [comm: string, deptno: bigint ... 6 more fields]

scala> df1.registerTempTable("emp") // deprecated since Spark 2.0; createOrReplaceTempView("emp") is the replacement
warning: there was one deprecation warning; re-run with -deprecation for details

scala> spark.sql("select * from emp").show

// Mark a table for caching (lazy: the data is cached the next time the table is queried)
scala> spark.sqlContext.cacheTable("emp")

// Clear all cached tables
scala> spark.sqlContext.clearCache
// Uncache a single table
scala> spark.sqlContext.uncacheTable("emp")

scala> spark.sql("select * from emp").show
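
The same steps can also be written with the `spark.catalog` API as a standalone program. This is a minimal sketch under assumptions, not the setup used for the measurement above: the object name `CacheTableSketch`, the local master, and the two in-memory rows standing in for `/testdata/emp.json` are all made up for illustration.

```scala
import org.apache.spark.sql.SparkSession

object CacheTableSketch {
  // Returns (isCached after cacheTable, isCached after uncacheTable).
  def run(spark: SparkSession): (Boolean, Boolean) = {
    import spark.implicits._

    // Assumption: two made-up rows stand in for the emp.json data.
    Seq((7369, "SMITH", 20), (7499, "ALLEN", 30))
      .toDF("empno", "ename", "deptno")
      .createOrReplaceTempView("emp")

    spark.catalog.cacheTable("emp")                     // lazy: only marks the table for caching
    val cachedAfterMark = spark.catalog.isCached("emp")

    spark.time(spark.sql("select * from emp").show())   // first query materializes the cache
    spark.time(spark.sql("select * from emp").show())   // later queries can read from the cache

    spark.catalog.uncacheTable("emp")                   // drop this table's cached data
    val cachedAfterUncache = spark.catalog.isCached("emp")
    (cachedAfterMark, cachedAfterUncache)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("cache-table-sketch")
      .master("local[*]")                               // assumption: local mode for illustration
      .getOrCreate()
    println(run(spark))
    spark.stop()
  }
}
```

With only two rows the timings printed by `spark.time` will not show a meaningful gap; the effect is visible on larger tables such as the one queried above.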