Measure Zero



A Wrong Way to Do Cross-Validation

2019-12-12 | ~ | Machine Learning

While this point may seem obvious to the reader, we have seen this blunder committed many times in published papers in top rank journals.

Consider a classification problem with a large number of predictors, as may arise, for example, in genomic or proteomic applications. A typical strategy for analysis might be as follows:

  1. Screen the predictors: find a subset of “good” predictors that show fairly strong (univariate) correlation with the class labels.
  2. Using just this subset of predictors, build a multivariate classifier.
  3. Use cross-validation to estimate the unknown tuning parameters and to estimate the prediction error of the final model.
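The excerpt stops before saying what goes wrong, so the following is only a minimal sketch under my own assumptions: pure-noise predictors, labels that are independent of them, and a simple nearest-centroid classifier standing in for the "multivariate classifier". The helper names (`screen`, `centroid_predict`, `cv_error`) are hypothetical. It contrasts screening once on all samples (the strategy listed above) with repeating the screening inside each training fold. Because the labels carry no signal, an honest error estimate should sit near 50%.

```python
import random

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def abs_corr(xs, ys):
    """Absolute Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return abs(sxy) / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0

def screen(X, y, rows, n_keep):
    """Step 1: indices of the n_keep predictors most correlated with y on `rows`."""
    labels = [y[i] for i in rows]
    scores = [abs_corr([X[i][j] for i in rows], labels) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:n_keep]

def centroid_predict(X, y, train, test_row, cols):
    """Step 2: nearest-centroid classifier restricted to the columns `cols`."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        members = [i for i in train if y[i] == label]
        centroid = [mean(X[i][j] for i in members) for j in cols]
        dist = sum((X[test_row][j] - c) ** 2 for j, c in zip(cols, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def cv_error(X, y, n_keep, n_folds, screen_inside):
    """Step 3: K-fold CV error, screening either once (wrong) or per fold."""
    n = len(y)
    all_rows = list(range(n))
    folds = [list(range(f, n, n_folds)) for f in range(n_folds)]
    # Wrong way: the screening has already seen the held-out labels.
    global_cols = None if screen_inside else screen(X, y, all_rows, n_keep)
    errors = 0
    for test in folds:
        train = [i for i in all_rows if i not in test]
        cols = screen(X, y, train, n_keep) if screen_inside else global_cols
        errors += sum(centroid_predict(X, y, train, i, cols) != y[i] for i in test)
    return errors / n

# Assumed toy data: 50 samples, 1000 noise predictors, labels unrelated to X.
rng = random.Random(0)
n_samples, n_predictors = 50, 1000
X = [[rng.gauss(0.0, 1.0) for _ in range(n_predictors)] for _ in range(n_samples)]
y = [i % 2 for i in range(n_samples)]

wrong = cv_error(X, y, n_keep=20, n_folds=5, screen_inside=False)
right = cv_error(X, y, n_keep=20, n_folds=5, screen_inside=True)
print(f"screen on all data, then CV: {wrong:.2f}")
print(f"screen inside each fold:     {right:.2f}")
```

The only difference between the two runs is where `screen` is called. When it sees every sample, the selected predictors already "know" the held-out labels, so the cross-validated error looks far better than the data can justify.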

Side Note: Information Entropy, Cross-Entropy and KL Divergence

2019-11-30 | ~ | Mathematics

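Consider an event $A$ that occurs with probability $p$. Suppose we observe that $A$ occurs. We want to define a quantity of information $I(p)$ that measures how much information the fact "$A$ occurred" gives us.

  1. $I(p)$ is a decreasing function of $p$. If an event has high probability and it does occur, we should gain little information, because we already believed it was likely to happen, so this is not surprising.
  2. Consider another independent event $B$ that occurs with probability $q$. Then $I(pq) = I(p) + I(q)$. In other words, when independent events occur together, the information they provide should be the sum of the information each provides separately.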
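As a quick sanity check of the two requirements, here is a small sketch that assumes the standard choice $I(p) = -\log_2 p$. The excerpt itself does not name a formula, so the function `info` and the base-2 logarithm are my assumptions.

```python
import math

def info(p):
    """Assumed information content, in bits, of an event with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return -math.log2(p)

# Property 1: more probable events carry less information.
probabilities = [0.1, 0.25, 0.5, 0.9, 1.0]
values = [info(p) for p in probabilities]
is_decreasing = all(a > b for a, b in zip(values, values[1:]))

# Property 2: information adds up over independent events.
p, q = 0.5, 0.25
additivity_gap = abs(info(p * q) - (info(p) + info(q)))

print(is_decreasing, additivity_gap)
```

Any positive multiple of this function, that is, any other logarithm base, passes the same two checks. The base only fixes the unit.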

A Hand-Ground Coffee Experience

2019-11-21 | ~ 2020-12-03 | Food and Cooking

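Last Saturday (2019/11/16) I tried making hand-ground coffee, also called pour-over coffee, at the café in the north part of campus. Here is a brief record.

1. The process of making hand-ground coffee

The equipment provided by the organizer is shown in the picture.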


An Example Where the Bootstrap Fails

2019-11-08 | ~ | Statistics

Suppose $Y_1, \dots, Y_n$ are i.i.d. and uniformly distributed on $[0,\theta]$. Then the likelihood function is

\[L(\theta|Y_1, \dots, Y_n) = \frac{1}{\theta^n} \prod_{k=1}^n 1_{\{ 0\le Y_k\le \theta \}}.\]
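A small sketch of this likelihood follows. The excerpt ends at the formula, so everything after the `likelihood` function is my assumption about where the example is heading: that the maximum likelihood estimate is the sample maximum, and that a bootstrap resample reproduces that maximum with probability $1 - (1 - 1/n)^n$ (assuming distinct observations). The function names are hypothetical.

```python
def likelihood(theta, ys):
    """L(theta | Y_1..Y_n) = theta^(-n) when every Y_k lies in [0, theta], else 0."""
    if theta <= 0 or min(ys) < 0 or max(ys) > theta:
        return 0.0
    return theta ** (-len(ys))

def mle(ys):
    """The likelihood is zero below max(ys) and decreasing above it."""
    return max(ys)

def prob_resample_hits_max(n):
    """Chance that a size-n resample (with replacement) contains the sample maximum."""
    return 1.0 - (1.0 - 1.0 / n) ** n

ys = [0.2, 0.9, 0.5]  # made-up observations, for illustration only
print(mle(ys), likelihood(0.8, ys), likelihood(0.9, ys), likelihood(1.0, ys))
print(prob_resample_hits_max(100))
```

The last quantity does not shrink as $n$ grows. It tends to $1 - 1/e \approx 0.632$, so the bootstrap distribution of the maximum keeps a large point mass at the observed maximum.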

Two Notes on the Median: Linear Time and LeetCode 4

2019-10-21 | ~ 2020-06-07 | Algorithms

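The most brute-force way to find the median is to sort first and then take the middle element, which costs $O(n\log n)$ time. I only learned later that there is an $O(n)$ algorithm for the median. In fact, any order statistic can be found in $O(n)$ time.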
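One way to reach $O(n)$ in the worst case is the median-of-medians selection algorithm. The excerpt does not say which algorithm the post uses, so the sketch below is my choice, and `select` and `median` are hypothetical names.

```python
def select(items, k):
    """Return the k-th smallest element (0-indexed) using median-of-medians pivots."""
    items = list(items)
    while True:
        if len(items) <= 5:
            return sorted(items)[k]
        # Median of each group of at most five elements.
        groups = [items[i:i + 5] for i in range(0, len(items), 5)]
        medians = [sorted(group)[len(group) // 2] for group in groups]
        # The median of those medians is a provably balanced pivot.
        pivot = select(medians, len(medians) // 2)
        lows = [x for x in items if x < pivot]
        highs = [x for x in items if x > pivot]
        n_equal = len(items) - len(lows) - len(highs)
        if k < len(lows):
            items = lows
        elif k < len(lows) + n_equal:
            return pivot
        else:
            k -= len(lows) + n_equal
            items = highs

def median(items):
    """Median via two order statistics; averages the middle pair for even n."""
    n = len(items)
    if n % 2 == 1:
        return select(items, n // 2)
    return (select(items, n // 2 - 1) + select(items, n // 2)) / 2

print(median([7, 1, 5, 3, 9, 2, 8]))
print(median([4, 1, 3, 2]))
```

Swapping the pivot rule for a uniformly random element gives quickselect, which is shorter but only linear in expectation.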


Lights-Out

2019-09-14 | ~ | Mathematics

Each employee of MegaCorp has a separate office in the MegaCorp office building. Each office is equipped with one overhead light and one toggle switch to turn the light on and off.

Every day, the employees turn on all lights when they come to work. Each evening they turn off all lights when they go home.

One day, the employees arrive to discover that someone has played a rather elaborate hoax on them. Though all looks fine when they come in (all lights are off), every time an employee flicks the switch in her office, this not only toggles the light in her office, but also the lights in the offices of all of her friends. (Friendship is a symmetric relationship.)

The question: does there necessarily exist an arrangement of the switches that will turn all lights simultaneously on (so that work can begin)? Prove your answer.
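For experimenting with small cases before attempting a proof, here is a brute-force sketch. The adjacency-list encoding of the friendship graph and the function name `find_switch_set` are my assumptions, and the search is exponential in the number of employees, so it is only meant for toy examples.

```python
from itertools import product

def find_switch_set(friends):
    """Try every subset of switches; return one that lights every office, or None.

    friends[i] lists the friends of employee i (assumed symmetric, no self-loops).
    """
    n = len(friends)
    for presses in product([0, 1], repeat=n):
        lights = [0] * n  # all lights start off
        for i, pressed in enumerate(presses):
            if pressed:
                lights[i] ^= 1          # her own light
                for j in friends[i]:
                    lights[j] ^= 1      # her friends' lights
        if all(lights):
            return [i for i, pressed in enumerate(presses) if pressed]
    return None

# Three employees in a line: 0 - 1 - 2.
print(find_switch_set([[1], [0, 2], [1]]))
```

Since flipping a switch twice cancels out and the order of flips does not matter, a candidate solution is just a subset of switches, which is why the search ranges over 0/1 vectors.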


Super Egg Drop

2019-09-13 | ~ | Mathematics

You are given $k$ eggs, and you have access to a building with $N$ floors numbered from $1$ to $N$.

Each egg is identical in function, and if an egg breaks, you cannot drop it again.

You know that there exists a floor $F$ with $0 \le F \le N$ such that any egg dropped at a floor higher than $F$ will break, and any egg dropped at or below floor $F$ will not break.

Each move, you may take an egg (if you have an unbroken one) and drop it from any floor $X$ (with $1 \le X \le N$).

Your goal is to know with certainty what the value of $F$ is.

What is the minimum number of moves that you need to know with certainty what $F$ is, regardless of the initial value of $F$?
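One standard way to compute this number, not necessarily the approach the post takes, is to turn the question around: with $m$ moves and $k$ eggs, how many floors can be fully resolved? Dropping once either breaks the egg (leaving $m-1$ moves and $k-1$ eggs for the floors below) or does not (leaving $m-1$ moves and $k$ eggs for the floors above), so the coverage satisfies $f(m,k) = f(m-1,k-1) + f(m-1,k) + 1$. The function name below is hypothetical.

```python
def super_egg_drop(k, n):
    """Minimum number of moves that always determines F with k eggs and n floors."""
    # floors[e] = number of floors resolvable with e eggs and the current move count.
    floors = [0] * (k + 1)
    moves = 0
    while floors[k] < n:
        moves += 1
        # Update from high to low so floors[eggs - 1] still holds the previous move's value.
        for eggs in range(k, 0, -1):
            floors[eggs] = floors[eggs] + floors[eggs - 1] + 1
    return moves

print(super_egg_drop(1, 2), super_egg_drop(2, 6), super_egg_drop(3, 14))
```

With a single egg the recurrence degenerates to $f(m,1) = m$, which matches the intuition that one egg forces a floor-by-floor search.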


A Collection of Introductory Resources for Competitive Pokémon Battling

2019-09-13 | ~ 2021-03-21 | Games

This is just a pile of materials, covering 6v6 singles (gen7) on Pokémon Showdown as well as the more common VGC doubles (bring 6, pick 4).

  • Gen 8 changes
    • Opinion: 10 Mechanics Changes for Pokémon Sword and Shield
    • Pokémon Sword and Shield – New Competitive Features and Mechanic Changes
  • Encyclopedias
    • 神奇宝贝百科
    • 口袋百科

English Miscellany

2019-09-08 | ~ 2022-09-06 | Language

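Useful websites

  • Vocabulary.com. Entertaining definitions, with example sentences grouped by field.
  • Oxford, Merriam-Webster, Collins. The three dictionaries I use most. “网易有道词典” integrates them and is also an extremely handy app.
  • Longman's distinguishing feature is that it lets you look up collocations.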

Handle vs handler

2022/9/6

See In programming, what is the difference between a handle and a handler? - Quora

Transparent

See meaning in context - What is the correct interpretation of transparent? - English Language Learners Stack Exchange


A Reading Marathon Experience

2019-09-06 | ~ | Miscellanea

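About the Reading Marathon

The Reading Marathon is a reading competition initiated by TELL and operated independently, which aims to promote reading in a simple and fun way. Participants must finish a book within a set time and reach a certain reading quality, which really just means answering some "reading comprehension" multiple-choice questions. Results are determined by reading time plus time penalties for wrong answers.

Note: TELL is formed from the initials of think, enjoy, live, link. It is an organization dedicated to researching and spreading the art and technique of storytelling.

The experience of the event was very poor.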

© 2019 - 2025   Shiina   CC BY-NC-ND 4.0