Suggestion for more general/performant sentinel for find* functions

I pinged because this seemed pretty critical to me. I was surprised that a change with such large performance repercussions was merged without much more discussion, and I understood from Stefan’s other posts that things might be really frozen even this week.

I understand perfectly why 0 is not a good sentinel for all cases, but that’s not a good reason to make a change that hurts performance this much (about 3.5x slower in the benchmark below) when other alternatives, such as my suggestion, are possible.
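To make the "0 is not a good sentinel" point concrete, here is a minimal sketch. It assumes Julia 1.0 or later, where the find* functions return keys for general collections and equalto is spelled isequal; the Dict and variable names are only for illustration.

```julia
# Assumes Julia 1.0+: findfirst returns the matching key, or `nothing` on a miss.
d = Dict(0 => "a", 1 => "b")        # 0 is a perfectly valid key here

hit  = findfirst(isequal("a"), d)   # key whose value is "a"
miss = findfirst(isequal("z"), d)   # no value equals "z"

# With 0 as the "not found" sentinel, these two results would be indistinguishable.
println(hit === 0)          # the match really is at key 0
println(miss === nothing)   # the miss is reported as `nothing`
```

So the objection here is not to dropping 0 as such, only to the cost of the replacement.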

On v0.6.2:

julia> ff(z) = findfirst(z, "foobar") == 0
ff (generic function with 1 method)

julia> @btime ff('z')
  6.034 ns (0 allocations: 0 bytes)
false

julia> @code_native ff('z')
	.section	__TEXT,__text,regular,pure_instructions
Filename: REPL[19]
	pushq	%rbp
	movq	%rsp, %rbp
Source line: 1
	movabsq	$findnext, %rax
	movabsq	$4950739376, %rsi       ## imm = 0x1271649B0
	movl	$1, %edx
	callq	*%rax
	testq	%rax, %rax
	sete	%al
	popq	%rbp
	retq
	nopw	(%rax,%rax)

On master (Version 0.7.0-DEV.3618 (2018-01-28 17:48 UTC), Commit 743d487 (0 days old master)):

julia> ff(z) = findfirst(equalto(z), "foobar") == nothing
ff (generic function with 1 method)

julia> @btime ff('z')
  20.847 ns (0 allocations: 0 bytes)
true

julia> @code_native(ff('z'))
	.section	__TEXT,__text,regular,pure_instructions
; Function ff {
; Location: REPL[21]:1
; Function Type; {
; Location: REPL[21]:1
	pushq	%rbx
	subq	$32, %rsp
	movl	%edi, (%rsp)
;}
; Function findfirst; {
; Location: array.jl:1702
	movabsq	$findnext, %rax
	leaq	24(%rsp), %rbx
	leaq	(%rsp), %rsi
	movabsq	$4677124368, %rdx       ## imm = 0x116C74110
	movl	$1, %ecx
	movq	%rbx, %rdi
	callq	*%rax
;}
	movl	%edx, %ecx
	andb	$127, %cl
	cmpb	$2, %cl
	je	L76
	cmpb	$1, %cl
	jne	L105
	movabsq	$"==", %rax
	callq	*%rax
	jmp	L97
; Function findfirst; {
; Location: array.jl:1702
L76:
	testb	%dl, %dl
	cmovnsq	%rbx, %rax
;}
	movq	(%rax), %rdi
	movabsq	$"==", %rax
	callq	*%rax
L97:
	andb	$1, %al
	addq	$32, %rsp
	popq	%rbx
	retq
L105:
	movabsq	$jl_system_image_data, %rax
	movq	%rax, 8(%rsp)
	movabsq	$jl_system_image_data, %rax
	movq	%rax, 16(%rsp)
	movabsq	$jl_apply_generic, %rax
	leaq	8(%rsp), %rdi
	movl	$2, %esi
	callq	*%rax
	ud2
	nop
;}

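That’s also a pretty large increase in code size: where v0.6.2 needed only a testq/sete pair after the findnext call, the master version branches on the returned type tag, calls == on two separate paths, and carries a jl_apply_generic fallback.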
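One thing that might be worth checking, which I have not measured, is whether the extra calls come from the generic == comparison rather than from the nothing sentinel itself. A minimal sketch, assuming a later Julia where equalto is spelled isequal; ff_eq and ff_egal are hypothetical names, not part of the benchmark above:

```julia
# Hypothetical helpers; `isequal(z)` stands in for the `equalto(z)` used above.
ff_eq(z)   = findfirst(isequal(z), "foobar") == nothing    # generic `==` call
ff_egal(z) = findfirst(isequal(z), "foobar") === nothing   # identity comparison

# Both spellings agree on the result, for a miss ('z') and a hit ('o').
for c in ('z', 'o')
    println(c, " => ", ff_eq(c), " ", ff_egal(c))
end
```

Running @btime and @code_native on both would show whether the identity comparison narrows the gap; no timings are claimed here.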